Abstract


Building  HVAC  equipment  routinely  fails  to  satisfy  performance  expectations  envisioned  at 
design.  Such  failures  often  go  unnoticed  for  extended  periods  of  time.  Additionally,  higher 
expectations  are  being  placed  on  a combination  of  different  and  often  conflicting  performance 
measures,  such  as  energy  efficiency,  indoor  air  quality,  comfort,  reliability,  limiting  peak 
demand  on  utilities,  etc.  To  meet  these  expectations,  the  processes,  systems,  and  equipment  used 
in  both  commercial  and  residential  buildings  are  becoming  increasingly  sophisticated.  This 
development  both  necessitates  the  use  of  automated  diagnostics  to  ensure  fault-free  operation 
and  enables  diagnostic  capabilities  for  the  various  building  systems  by  providing  a distributed 
platform  that  is  powerful  and  flexible  enough  to  perform  fault  detection  and  diagnostics  (FDD). 

The  purpose  of  the  research  effort  described  in  this  report  is  to  develop,  test,  and  demonstrate 
FDD  methods  that  can  detect  common  mechanical  faults  and  control  errors  in  air-handling  units 
(AHUs)  and  variable-air-volume  (VAV)  boxes.  The  tools  are  intended  to  be  sufficiently  simple 
that  they  can  be  embedded  in  commercial  building  control  systems  and  rely  upon  only  sensor 
data  and  control  signals  that  are  commonly  available  in  commercial  building  automation  and 
control  systems. 

AHU  Performance  Assessment  Rules  (APAR)  is  a diagnostic  tool  that  uses  a set  of  expert  rules 
derived  from  mass  and  energy  balances  to  detect  faults  in  air-handling  units.  Control  signals  are 
used  to  determine  the  mode  of  operation  for  the  AHU.  A subset  of  the  expert  rules  corresponding 
to  that  mode  of  operation  is  then  evaluated  to  determine  if  there  is  a mechanical  fault  or  a control 
problem. VAV box Performance Assessment Control Charts (VPACC) is a diagnostic tool that
uses  statistical  quality  control  measures  to  detect  faults  or  control  problems  in  VAV  boxes. 

This report describes a research study of embedding APAR and VPACC in HVAC equipment
controllers in a laboratory setting. APAR was embedded in several air handling unit controllers
and evaluated in an emulation environment, while VPACC was embedded in several VAV box
controllers and evaluated in a laboratory environment. APAR and VPACC were both found to
be  successful  at  finding  a wide  variety  of  mechanical  and  control  faults.  Both  tools  appear  to  be 
suitable  for  embedding  in  commercial  control  products. 

Key  words:  BACnet,  building  automation  and  control,  direct  digital  control,  energy  management 
systems,  fault  detection  and  diagnostics,  cybernetic  building  systems 
1 Introduction


Building  HVAC  equipment  routinely  fails  to  satisfy  performance  expectations  envisioned 
at  design.  Such  failures  often  go  unnoticed  for  extended  periods  of  time.  Additionally, 
higher  expectations  are  being  placed  on  a combination  of  different  and  often  conflicting 
performance  measures,  such  as  energy  efficiency,  indoor  air  quality,  comfort,  reliability, 
limiting  peak  demand  on  utilities,  etc.  To  meet  these  expectations,  the  processes,  systems, 
and  equipment  used  in  both  commercial  and  residential  buildings  are  becoming 
increasingly  sophisticated.  This  development  both  necessitates  the  use  of  automated 
diagnostics  to  ensure  fault-free  operation  and  enables  diagnostic  capabilities  for  the 
various  building  systems  by  providing  a distributed  platform  that  is  powerful  and  flexible 
enough  to  perform  fault  detection  and  diagnostics  (FDD). 

Most  of  today’s  emerging  FDD  tools  are  stand-alone  software  products  that  do  not  reside 
in  a building  control  system.  Thus,  trend  data  files  must  be  processed  off-line,  or  an 
interface  to  the  building  control  system  must  be  developed  to  enable  on-line  analysis. 
This  does  not  scale  well  because  all  of  the  data  must  be  obtained  at  a single  point.  A 
better  solution  is  to  embed  FDD  in  the  local  controller  for  each  piece  of  equipment,  so 
that  the  FDD  algorithm  is  executed  as  a component  of  the  control  logic.  NIST  has 
developed  FDD  methods  that  can  detect  common  mechanical  faults  and  control  errors  in 
air-handling units (AHUs) and variable-air-volume (VAV) boxes. The tools are
sufficiently simple that they can be embedded in commercial building control systems
and rely only upon sensor data and control signals that are commonly available in
commercial building automation and control systems. AHU Performance Assessment
Rules (APAR) and VAV box Performance Assessment Control Charts (VPACC) have
been  designed  “from  the  ground  up”  to  be  embedded  in  commercial  HVAC  equipment 
controllers. 

A previous study [1] describes the theoretical basis of APAR and VPACC and evaluates
the tools’ performance using data generated by simulation, emulation, and laboratory
testing. The study examined the breadth of faults that can be detected and the conditions
under which they can be detected. This report describes the results of a research study to
evaluate the feasibility of embedding APAR and VPACC in HVAC equipment
controllers in a laboratory setting.
2 Methodology

2.1  FDD  for  Air  Handling  Units 

The fault detection tool described in this section was developed for application to single
duct variable-volume or constant-volume air handlers with hydronic heating and cooling
coils  and  airside  economizers.  The  rules  that  are  used  for  FDD  focus  on  temperature 
control  in  an  AHU.  Hence,  the  system  description  will  be  restricted  to  components  and 
control  strategies  directly  related  to  temperature  control.  Figure  2.1  is  a schematic 
diagram  of  a typical  single  duct  air  handling  unit  (AHU). 


Figure  2.1:  Schematic  diagram  of  a single  duct  air-handling  unit 


2.1.1  System  Description 

The  AHU  controller  typically  controls  the  supply  air  temperature  to  maintain  a setpoint 
temperature  at  a location  in  the  supply  duct  downstream  of  the  supply  fan.  Outdoor  air 
enters  the  AHU  and  is  mixed  with  air  returned  from  the  building.  A single  mixing  box 
damper  control  signal  is  mapped  to  the  outdoor  air  damper,  the  exhaust  air  damper,  and 
the  recirculation  air  damper.  A control  signal  of  100  % means  that  the  dampers  are 
positioned  for  100  % outdoor  air  (outdoor  and  exhaust  air  dampers  are  fully  open  and 
recirculation  air  damper  is  fully  closed);  a control  signal  of  0 % means  that  the  dampers 
are  aligned  for  0 % outdoor  air  (outdoor  and  exhaust  air  dampers  are  fully  closed  and 
recirculation air damper is fully open). The mixed air passes over the heating and cooling
coils, where, if necessary, it is conditioned prior to being supplied to the building. The
typical  operating  sequence  for  AHUs  consists  of  four  primary  modes  of  operation  during 
occupied periods for maintaining the supply air temperature and the ventilation at preset
levels.  The  relationship  of  the  four  operating  modes  to  the  control  of  the  heating  coil 
valve,  the  cooling  coil  valve  and  the  mixing  box  dampers  is  shown  in  Figure  2.2. 
Sequencing  logic  determines  the  mode  of  operation  as  dictated  by  various  thermal 
relationships  including  the  internal  and  external  loads  on  the  zones  served  by  the  AHU. 

In  the  heating  mode  (Mode  1 in  Figure  2.2),  the  heating  coil  valve  is  controlled  to 
maintain  the  supply  air  temperature  at  the  heating  set  point  and  the  cooling  coil  valve  is 
closed.  The  mixing  box  dampers  are  positioned  to  allow  the  minimum  outdoor  air 
necessary  to  satisfy  ventilation  requirements. 
Figure 2.2: Typical operating modes of an air-handling unit

As  cooling  loads  increase,  the  AHU  transitions  from  heating  to  cooling  with  outside  air 
(Mode  2).  In  this  mode,  the  heating  and  cooling  coil  valves  are  closed  and  the  mixing  box 
dampers  are  modulated  to  maintain  the  supply  air  temperature  at  cooling  set  point.  As  the 
loads  continue  to  increase,  the  mixing  dampers  eventually  saturate  with  the  outdoor  air 
damper  fully  open  and  the  AHU  changes  over  to  mechanical  cooling.  When  the  AHU  is 
operating  in  one  of  the  mechanical  cooling  modes  (Modes  3 and  4),  the  cooling  coil  valve 
modulates  to  maintain  the  supply  air  temperature  at  cooling  set  point,  the  heating  coil 
valve  is  closed,  and  the  outdoor  air  damper  is  either  fully  open  or  at  its  minimum 
position. There are several different types of economizer controls. Generally, the
economizer  control  logic  uses  a comparison  of  the  outdoor  and  return  air  temperatures  or 
enthalpies  to  determine  the  proper  position  of  the  outdoor  air  damper  such  that 
mechanical  cooling  requirements  are  minimized.  Hence,  the  third  primary  mode  (Mode 
3)  of  operation  is  mechanical  cooling  with  100  % outdoor  air  and  the  fourth  primary 
mode  (Mode  4)  of  operation  is  mechanical  cooling  with  minimum  outdoor  air. 


2.1.2  AHU  Performance  Assessment  Rules  (APAR) 

The  basis  for  the  fault  detection  methodology  is  a set  of  expert  rules  used  to  assess  the 
performance  of  the  AHU.  The  tool  developed  from  these  rules  is  referred  to  as  APAR 
(AHU  Performance  Assessment  Rules).  APAR  uses  control  signals  and  occupancy 
information  to  identify  the  mode  of  operation  of  the  AHU,  thereby  identifying  a subset  of 
the  rules  that  specify  temperature  relationships  that  are  applicable  for  that  mode.  The  two 
main  mode  classifications  are  occupied  and  unoccupied.  For  occupied  periods,  the  mode 
is further categorized as described in the previous section. For convenience, the
operating modes are summarized below:

• Mode 1: heating

• Mode  2:  cooling  with  outdoor  air 

• Mode  3:  mechanical  cooling  with  100  % outdoor  air 

• Mode  4:  mechanical  cooling  with  minimum  outdoor  air 

• Mode  5:  unknown 

Because the direct digital control (DDC) outputs to the actuators of the heating and
cooling coil valves and the mixing box dampers are known, the mode of operation can be
ascertained. Although not depicted in Figure 2.2, a fifth mode of operation, referred to as
“unknown” operation, has been defined and listed above. The unknown mode applies to
the  case  in  which  the  AHU  is  running  in  an  occupied  mode,  but  none  of  the  control 
output  relationships  defined  for  Modes  1-4  are  satisfied.  The  unknown  mode  could  be 
associated  with  mode  transitions  and/or  with  faulty  operation  such  as  simultaneous 
heating  and  cooling. 
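
The mode identification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_mode`, the parameter `u_d_min` (the normalized damper signal that gives minimum outdoor air), the default threshold values, and the exact control-output relationships used for each mode are hypothetical choices that are consistent with the mode descriptions in this section, not the relationships used in the embedded controllers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalThresholds:
    """Hypothetical thresholds on the normalized control signals."""
    eps_hc: float = 0.02  # heating coil valve
    eps_cc: float = 0.02  # cooling coil valve
    eps_d: float = 0.02   # mixing box damper


def classify_mode(u_hc, u_cc, u_d, u_d_min, th=SignalThresholds()):
    """Return the occupied operating mode (1-5) from normalized signals in [0, 1].

    u_d_min is the damper signal for minimum outdoor air (assumed to be known).
    """
    heating = u_hc > th.eps_hc
    cooling = u_cc > th.eps_cc
    damper_at_min = abs(u_d - u_d_min) <= th.eps_d
    damper_full_open = u_d >= 1.0 - th.eps_d

    if heating and not cooling and damper_at_min:
        return 1  # heating
    if not heating and not cooling:
        return 2  # cooling with outdoor air (dampers modulate)
    if cooling and not heating and damper_full_open:
        return 3  # mechanical cooling with 100 % outdoor air
    if cooling and not heating and damper_at_min:
        return 4  # mechanical cooling with minimum outdoor air
    return 5      # unknown, e.g. simultaneous heating and cooling
```

For example, `classify_mode(0.5, 0.5, 0.2, 0.2)` returns 5 because both coil valves are open at the same time.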

Once  the  mode  of  operation  has  been  established,  rules  based  on  conservation  of  mass 
and  energy  can  be  used  along  with  the  sensor  information  that  is  typically  available  for 
controlling  the  AHUs.  For  example,  normal  operation  in  the  mechanical  cooling  mode 
with  100  % outdoor  air  (Mode  3)  dictates  that  the  outdoor  and  mixed  air  temperatures 
must be approximately equal. Defining $T_{oa}$ and $T_{ma}$ as the outdoor air and mixed air
temperatures, respectively, the rule (defined as Rule 10) is written as

Rule 10: $|T_{oa} - T_{ma}| > \varepsilon_t$

where $\varepsilon_t$ is a threshold that depends on the uncertainty (or accuracy) of the
measurements.  The  rules  are  written  such  that  a fault  is  indicated  if  a rule  is  true.  In  the 
example  above,  the  rule  states  that  if  the  outdoor  and  mixed  air  temperatures  are  not  the 
same  (i.e.,  if  true)  a fault  has  occurred. 
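
As a minimal sketch of how such a rule might be evaluated in code, the following Python function returns True when Rule 10 indicates a fault. The function name and the numerical values in the usage note are illustrative assumptions, not values from this study.

```python
def rule_10(t_oa, t_ma, eps_t):
    """Rule 10 (Mode 3): True indicates a fault.

    t_oa  -- outdoor air temperature
    t_ma  -- mixed air temperature
    eps_t -- temperature threshold, in the same units as the temperatures
    """
    return abs(t_oa - t_ma) > eps_t
```

With an assumed threshold of 1.0 °C, `rule_10(20.0, 20.5, 1.0)` is False (no fault indicated), whereas `rule_10(20.0, 23.0, 1.0)` is True.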

As  a detailed  description  of  the  28  APAR  rules  and  the  reasoning  behind  them  is 
available  elsewhere  [2],  the  rules  are  simply  listed  in  Table  2.1  without  detailed 
explanation.  Table  2.1  groups  the  rules  according  to  mode  of  operation.  As  indicated  in 
the  column  heading  for  the  rule  expression,  a true  expression  indicates  a fault.  Table  2.2 
presents  the  rules  as  related  groups  and  indicates  the  sensors  and  control  signals  used  to 
evaluate  each  rule.  The  first  group  of  rules  treats  the  relationship  of  temperatures  in  the 
coil subsystem of the AHU. For these four rules, only the relational operator in the rules
changes from one mode to another. A typical rule from this subgroup requires the supply
air  temperature  to  be  lower  than  the  sum  of  the  mixed  air  temperature  and  the 
temperature  rise  across  the  supply  fan  in  the  mechanical  cooling  modes.  There  are  also 
groups  of  rules  treating  the  mixing  box  subsystem,  the  zone  subsystem,  economizer 
operation,  comfort  requirements,  and  controller  logic/tuning.  Hence,  although  there  are 
28  rules,  in  reality  only  a small  number  of  temperature  and  control  signal  relationships 
are  used  to  define  the  rules. 
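Table 2.1: APAR Rule Set

Mode                              Rule #  Rule Expression (true implies existence of a fault)

Heating                           1       $T_{sa} < T_{ma} + \Delta T_{sf} - \varepsilon_t$
(Mode 1)                          2       For $|T_{ra} - T_{oa}| \ge \Delta T_{min}$: $|Q_{oa}/Q_{sa} - (Q_{oa}/Q_{sa})_{min}| > \varepsilon_f$
                                  3       $|u_{hc} - 1| \le \varepsilon_{hc}$ and $T_{sa,s} - T_{sa} \ge \varepsilon_t$
                                  4       $|u_{hc} - 1| \le \varepsilon_{hc}$

Cooling with Outdoor Air          5       $T_{oa} > T_{sa,s} - \Delta T_{sf} + \varepsilon_t$
(Mode 2)                          6       $T_{sa} > T_{ra} - \Delta T_{rf} + \varepsilon_t$
                                  7       $|T_{sa} - \Delta T_{sf} - T_{ma}| > \varepsilon_t$

Mechanical Cooling with           8       $T_{oa} < T_{sa,s} - \Delta T_{sf} - \varepsilon_t$
100 % Outdoor Air                 9       $T_{oa} > T_{co} + \varepsilon_t$
(Mode 3)                          10      $|T_{oa} - T_{ma}| > \varepsilon_t$
                                  11      $T_{sa} > T_{ma} + \Delta T_{sf} + \varepsilon_t$
                                  12      $T_{sa} > T_{ra} - \Delta T_{rf} + \varepsilon_t$
                                  13      $|u_{cc} - 1| \le \varepsilon_{cc}$ and $T_{sa} - T_{sa,s} \ge \varepsilon_t$
                                  14      $|u_{cc} - 1| \le \varepsilon_{cc}$

Mechanical Cooling with           15      $T_{oa} < T_{co} - \varepsilon_t$
Minimum Outdoor Air               16      $T_{sa} > T_{ma} + \Delta T_{sf} + \varepsilon_t$
(Mode 4)                          17      $T_{sa} > T_{ra} - \Delta T_{rf} + \varepsilon_t$
                                  18      For $|T_{ra} - T_{oa}| \ge \Delta T_{min}$: $|Q_{oa}/Q_{sa} - (Q_{oa}/Q_{sa})_{min}| > \varepsilon_f$
                                  19      $|u_{cc} - 1| \le \varepsilon_{cc}$ and $T_{sa} - T_{sa,s} \ge \varepsilon_t$
                                  20      $|u_{cc} - 1| \le \varepsilon_{cc}$

Unknown Occupied Modes            21      $u_{cc} > \varepsilon_{cc}$ and $u_{hc} > \varepsilon_{hc}$ and $\varepsilon_d < u_d < 1 - \varepsilon_d$
(Mode 5)                          22      $u_{hc} > \varepsilon_{hc}$ and $u_{cc} > \varepsilon_{cc}$
                                  23      $u_{hc} > \varepsilon_{hc}$ and $u_d > \varepsilon_d$
                                  24      $\varepsilon_d < u_d < 1 - \varepsilon_d$ and $u_{cc} > \varepsilon_{cc}$

All Occupied Modes                25      $|T_{sa} - T_{sa,s}| > \varepsilon_t$
(Mode 1, 2, 3, 4, or 5)           26      $T_{ma} < \min(T_{ra}, T_{oa}) - \varepsilon_t$
                                  27      $T_{ma} > \max(T_{ra}, T_{oa}) + \varepsilon_t$
                                  28      Number of mode transitions per hour > $MT_{max}$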



Where

$MT_{max}$ = maximum number of mode changes per hour
$T_{sa}$ = supply air temperature
$T_{ma}$ = mixed air temperature
$T_{ra}$ = return air temperature
$T_{oa}$ = outdoor air temperature
$T_{co}$ = changeover air temperature for switching between Modes 3 and 4
$T_{sa,s}$ = supply air temperature set point
$\Delta T_{sf}$ = temperature rise across the supply fan
$\Delta T_{rf}$ = temperature rise across the return fan
$\Delta T_{min}$ = threshold on the minimum temperature difference between the return and outdoor air
$Q_{oa}/Q_{sa}$ = outdoor air fraction = $(T_{ma} - T_{ra})/(T_{oa} - T_{ra})$
$(Q_{oa}/Q_{sa})_{min}$ = threshold on the minimum outdoor air fraction
$u_{hc}$ = normalized heating coil valve control signal [0,1] where $u_{hc} = 0$ indicates the valve is closed and $u_{hc} = 1$ indicates it is 100 % open
$u_{cc}$ = normalized cooling coil valve control signal [0,1] where $u_{cc} = 0$ indicates the valve is closed and $u_{cc} = 1$ indicates it is 100 % open
$u_d$ = normalized mixing box damper control signal [0,1] where $u_d = 0$ indicates the outdoor air damper is closed and $u_d = 1$ indicates it is 100 % open
$\varepsilon_t$ = threshold for errors in temperature measurements
$\varepsilon_f$ = threshold parameter accounting for errors related to airflows (function of uncertainties in temperature measurements)
$\varepsilon_{hc}$ = threshold parameter for the heating coil valve control signal
$\varepsilon_{cc}$ = threshold parameter for the cooling coil valve control signal
$\varepsilon_d$ = threshold parameter for the mixing box damper control signal
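
The outdoor air fraction and the rule that uses it in the heating mode (Rule 2) can be sketched as follows. This is an illustrative reading of the definitions in this section: the function names are hypothetical, and the choice to skip the rule when the return and outdoor air temperatures are too close together (to avoid dividing by a near-zero difference) reflects the role of the minimum temperature difference threshold.

```python
def outdoor_air_fraction(t_ma, t_ra, t_oa):
    """Outdoor air fraction Q_oa/Q_sa estimated from a temperature balance."""
    return (t_ma - t_ra) / (t_oa - t_ra)


def rule_2(t_ma, t_ra, t_oa, oaf_min, dt_min, eps_f):
    """Rule 2 (Mode 1): True indicates a fault.

    oaf_min -- outdoor air fraction required for ventilation
    dt_min  -- minimum |T_ra - T_oa| for the rule to be evaluated
    eps_f   -- threshold on the outdoor air fraction error
    """
    if abs(t_ra - t_oa) < dt_min:
        return False  # temperatures too close; the rule is not evaluated
    return abs(outdoor_air_fraction(t_ma, t_ra, t_oa) - oaf_min) > eps_f
```

For instance, with a return air temperature of 22 °C, an outdoor air temperature of 2 °C, and a mixed air temperature of 18 °C, the estimated outdoor air fraction is 0.2.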


2.1.2.1 Operational and Design Data Requirements

APAR  uses  the  following  occupancy  information,  setpoint  values,  sensor  measurements, 
and  control  signals: 


• Occupancy status;

• Supply air temperature set point;

• Supply air temperature;

• Return air temperature;

• Mixed air temperature;

• Outdoor air temperature;

• Cooling coil valve control signal;

• Heating coil valve control signal;

• Mixing box damper control signal;

• Return air relative humidity (for enthalpy-based economizers only);

• Outdoor air relative humidity (for enthalpy-based economizers only).
Table 2.2: Summary of Rule Relationships

                  Sensors and Control Signals²
Rule  Mode¹       Tsa  Tra  Tma  Toa  Tsa,s  ΔTsf  ΔTrf  Tco  ucc  uhc  ud

Coil Subsystem: The relational sign (<, >, etc.) changes based on the mode of operation.
1     1           x         x                x
7     2           x         x                x
11    3           x         x                x
16    4           x         x                x

Mixing Box Subsystem: Rules are related through calculation of outdoor air fraction. If
Rule 26 or 27 is satisfied, the outdoor air fraction will be negative or greater than unity.
2     1                x    x    x
18    4                x    x    x
26    1, 2, 3, 4       x    x    x
27    1, 2, 3, 4       x    x    x

Comfort Requirements: The first four rules indicate comfort is sacrificed (with Rules 3, 13,
and 19 indicating the system is out of control), whereas the latter three rules indicate comfort
could soon be sacrificed (system is out of control).
25    1, 2, 3, 4  x                   x
3     1           x                   x                            x
13    3           x                   x                       x
19    4           x                   x                       x
4     1                                                            x
14    3                                                       x
20    4                                                       x

The relational sign (<, >, etc.) changes based on the mode of operation.
5     2                          x    x      x
8     3                          x    x      x

6     2           x    x                           x

¹ The dash symbol indicates either an unknown mode or multiple modes of operation.

² An x indicates which quantities are compared for the given rule.


This  information  is  generally  available  for  most  AHUs  controlled  with  a DDC  system.  If  one  or 
more  sensors  are  not  available,  certain  rules  will  no  longer  be  applicable.  For  instance,  in  the 
absence  of  a mixed  air  temperature  sensor,  nine  rules  listed  in  Table  2.1  (Rules  1,  2,  7,  10,  11, 
16,  18,  26,  and  27)  will  be  eliminated  from  consideration  in  APAR.  Conversely,  the  presence  of 
additional  sensors  would  expand  the  rule  set  and  provide  an  opportunity  to  either  detect  more 
faults,  or  to  detect  faults  during  modes  of  operation  in  which  they  would  normally  be  hidden.  For 
instance,  if  a temperature  sensor  was  installed  between  the  heating  and  cooling  coils,  leakage 
through  the  heating  valve  could  be  detected  during  the  mechanical  cooling  modes,  whereas 
normally  it  would  be  masked  in  these  modes. 
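
The effect of a missing sensor on the rule set can be sketched as a simple filtering step. The mapping below encodes only the dependency stated in the text (the nine rules that require the mixed air temperature); the names `RULES_REQUIRING` and `applicable_rules`, and the sensor labels, are hypothetical.

```python
ALL_RULES = frozenset(range(1, 29))  # Rules 1 through 28

# Rules that cannot be evaluated without a given sensor. Only the mixed air
# temperature dependency stated in the text is listed here.
RULES_REQUIRING = {
    "T_ma": frozenset({1, 2, 7, 10, 11, 16, 18, 26, 27}),
}


def applicable_rules(available_sensors):
    """Return the set of rule numbers that remain usable."""
    rules = set(ALL_RULES)
    for sensor, dependent_rules in RULES_REQUIRING.items():
        if sensor not in available_sensors:
            rules -= dependent_rules
    return rules
```

Without a mixed air temperature sensor, 19 of the 28 rules remain available under this mapping.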


In  addition  to  the  operational  data  listed  above,  certain  design  data  are  needed  to  implement  the 
rules.  The  required  design  data  are: 


• Minimum  and  maximum  values  of  control  signals  for  the  heating  coil  valve,  cooling  coil 
valve  and  mixing  box  dampers  for  normalizing  the  control  signals; 

• Percentage  outdoor  air  necessary  to  satisfy  ventilation  requirements; 

• Changeover  temperature  from  mechanical  cooling  with  100%  outdoor  air  to  mechanical 
cooling  with  minimum  outdoor  air  (or  equivalent  condition  for  enthalpy-based  economizer); 

• Description  of  sequencing/economizer  cycle  strategy  (used  to  verify  that  the  rules  are 
suitable  to  a particular  AHU  installation). 
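
As an illustration of the first item in the list above, a raw control signal can be mapped onto the normalized range [0, 1] used by the rules. This is a sketch only: the function name is hypothetical, and clamping out-of-range values to the interval is an assumption rather than a documented requirement.

```python
def normalize_signal(raw, raw_min, raw_max):
    """Map a raw control signal onto [0, 1] using its design limits.

    raw_min and raw_max are the signal values for fully closed and fully
    open; values outside the range are clamped (an assumption).
    """
    if raw_max == raw_min:
        raise ValueError("raw_min and raw_max must differ")
    u = (raw - raw_min) / (raw_max - raw_min)
    return min(1.0, max(0.0, u))
```

For a hypothetical 2 V to 10 V actuator signal, `normalize_signal(6.0, 2.0, 10.0)` returns 0.5.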
2.1.2.2 Detecting and Diagnosing Faults

APAR  does  not  search  for  the  existence  of  a specific  set  of  faults.  Rather,  any  fault  that  causes  a 
rule  to  be  satisfied  would  be  detected  and  additional  effort  would  be  necessary  to  isolate  the 
source  of  the  problem.  Emulation,  simulation,  and  laboratory  experimentation  from  previous 
work  [1]  as  well  as  the  current  study  demonstrate  that  the  rule  set  can  identify  the  following 
faults: 

• Stuck  or  leaking  mixing  box  dampers,  heating  coil  valves,  and  cooling  coil  valves; 

• Temperature  sensor  faults; 

• Design  faults  such  as  undersized  coils; 

• Controller  programming  errors  related  to  tuning,  setpoints,  and  sequencing  logic; 

• Inappropriate  operator  intervention. 

The  operating  point,  severity  of  a fault,  and  threshold  selection  for  the  rules  will  obviously 
influence  when  a particular  rule  is  satisfied.  Threshold  selection  is  discussed  next. 

2.1.2.3 Threshold Selection

In addition to the sensor data, control signals, and setpoint information, there are other parameters
that must be specified for APAR. For instance, estimates of the temperature rise across the
supply fan (and return fan, if one exists) must be provided; a reasonable default is 1.1 °C (2.0 °F).
A model-based value correlated to the airflow rate or the control signal to the fan could be used
as the basis for this estimate; however, some amount of training data would likely be necessary
to establish the correlation. Thresholds used in the evaluation of rules, such as $\varepsilon_t$ in Rule 10, must also
be specified. One approach might be to calculate the threshold values based on the
uncertainty of each sensor or actuator value. As an example, the threshold in Rule 10 would be
determined from the expression

$\varepsilon_t = \varepsilon_{T_{oa}} + \varepsilon_{T_{ma}}$

where $\varepsilon_{T_{oa}}$ and $\varepsilon_{T_{ma}}$ are the uncertainties associated with the measurement of the outdoor and
mixed air temperatures. If a threshold is too great, the associated fault(s) must be relatively
severe to be detected. If, on the other hand, a threshold is too small, normal variation in
operating conditions may result in false alarms. These threshold values were determined
heuristically for each of the sites in this study.
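
The trade-off between sensitivity and false alarms can be illustrated with a short sketch that sums individual sensor uncertainties into a rule threshold. The function name and the uncertainty values are hypothetical and are not the thresholds used in this study.

```python
def combined_threshold(*uncertainties):
    """Sum the measurement uncertainties (all in the same units)."""
    return sum(uncertainties)


# Two hypothetical sensor grades and a 1.5 degC offset between the
# outdoor and mixed air temperatures while in Mode 3.
offset = 1.5
tight = combined_threshold(0.5, 0.5)  # well-calibrated sensors
loose = combined_threshold(1.0, 1.0)  # poorly calibrated sensors

detected_with_tight = abs(offset) > tight
detected_with_loose = abs(offset) > loose
```

Under these assumed values the offset exceeds the tighter threshold but not the looser one, so the same fault would be reported only with the better-calibrated sensors.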

2.1.3  Instrumentation  Accuracy  Requirements 

APAR  uses  existing  sensor  points  in  the  control  system  to  perform  the  fault  detection 
calculations.  Previous  work  [1]  demonstrated  that  the  typical  industrial  grade  sensors  that  are 
already  installed  for  control  purposes  have  sufficient  accuracy.  Laboratory  grade  instruments  are 
not  required.  Higher  quality  sensors  that  have  been  installed  and  calibrated  properly  will  allow 
the  use  of  tighter  thresholds  (less  severe  faults  can  be  detected)  than  lower  quality  sensors,  or 
those  that  have  been  poorly  calibrated  or  installed. 
2.2 FDD for VAV Boxes


2.2.1  VAV  box  Performance  Assessment  Control  Charts  - VPACC 

The  primary  purpose  of  heating,  ventilating,  and  air-conditioning  (HVAC)  equipment  in 
commercial  buildings  is  to  provide  a comfortable  and  healthy  environment  for  occupants. 
Variable-air-volume (VAV) air handling systems are common for conditioning air and delivering
the  air  to  occupied  zones.  VAV  boxes  are  an  integral  part  of  such  systems  and  are  the  final  piece 
of  equipment  that  air  passes  through  prior  to  reaching  the  occupants.  As  such,  it  is  important  to 
ensure  that  these  devices  operate  correctly. 

The  challenges  presented  in  detecting  and  diagnosing  faults  in  VAV  boxes  are  similar  to  those 
encountered  with  other  pieces  of  HVAC  equipment.  Generally  there  are  very  few  sensors, 
making  it  difficult  to  ascertain  what  is  happening  in  the  device.  Limitations  associated  with 
controller  memory  and  communication  capabilities  further  complicate  the  task.  The  number  of 
different  types  of  VAV  boxes  and  lack  of  standardized  control  sequences  add  a final  level  of 
complexity to the challenge. These constraints make the development of VAV box fault
detection tools difficult, but the large number of VAV boxes in service makes the effort worthwhile.
For instance, buildings may have ten to fifteen times more VAV boxes than air-handling
units.  Hence,  maintenance  staffs  would  clearly  benefit  from  a tool  that  assisted  them  in 
monitoring  VAV  box  operation. 

The  needs  and  constraints  described  above  have  led  to  the  development  of  VAV  Box 
Performance  Assessment  Control  Charts  (VPACC),  a fault  detection  tool  that  uses  a small 
number  of  control  charts  to  assess  the  performance  of  VAV  boxes.  The  underlying  approach, 
while  developed  for  a specific  type  of  VAV  box  and  control  sequence,  is  general  in  nature  and 
can  be  adapted  to  other  types  of  VAV  boxes.  This  section  describes  the  basic  concept  of  control 
charts  and  their  use  for  determining  when  control  processes  have  gone  “out  of  control”.  The 
specific  control  charts  developed  and  implemented  in  VPACC  are  then  presented  for  a single 
duct  pressure-independent  throttling  VAV  box  with  reheat. 

2.2.2  Control  Charts 

Control  charts  are  common  tools  for  monitoring  control  processes  wherein  a measured  quantity 
is  compared  to  upper  and  lower  limits  that  define  allowable  (or  fault  free)  operation.  If  the 
measured  quantity  falls  outside  these  limits,  the  process  is  said  to  be  “out  of  control.”  The  limits 
are  typically  defined  using  statistical  parameters  and,  therefore,  control  charts  are  often  referred 
to  as  statistical  quality  control  charts. 

There  are  many  different  types  of  control  charts.  VPACC  implements  an  algorithm  known  as  a 
CUSUM  (cumulative  sum)  chart.  The  basic  concept  behind  CUSUM  charts  is  to  accumulate  the 
error  between  a process  output  and  the  expected  value  of  the  output.  Large  values  of  the 
accumulated error are indicative of an out of control process. With the process output at
sampling time i denoted $x_i$, the estimate of the expected value denoted $\bar{x}$, and the estimate of the
process standard deviation denoted by $\hat{\sigma}$, the normalized process output is given by:

$z_i = \dfrac{x_i - \bar{x}}{\hat{\sigma}}$ (1)


The normalized process output is used to compute two cumulative sums defined as follows:

$S_i = \max[0,\ z_i - k + S_{i-1}]$ (2a)

$T_i = \min[0,\ z_i + k + T_{i-1}]$ (2b)


where  k is  a slack  parameter  that  must  be  specified.  Positive  values  of  z greater  than  k cause  the 
sum  S to  move  away  from  zero  and  the  sum  T to  approach  or  remain  at  zero.  Negative  values  of  z 
less  than  -k  cause  the  sum  T to  move  away  from  zero  and  the  sum  S to  approach  or  remain  at 
zero.  A process  is  said  to  be  out  of  control  when  either  S exceeds  a threshold  value  defined  by 
the  parameter  h,  or  T falls  below  -h.  Figure  2.3  [3]  presents  normalized  data  and  the  S and  T 
cumulative  sums  for  k = 0.5  and  h = 5.  The  first  20  data  points  come  from  a random  normal 
distribution  with  a mean  value  of  zero  and  a standard  deviation  of  unity.  The  mean  value  is  then 
increased  to  0.25,  0.5,  0.75  and  1.0  for  subsequent  sets  of  20  data  points.  Note  that  S exceeds  the 
threshold  value  of  h after  about  68  data  points.  Because  the  mean  value  increases  above  0,  the 
cumulative  sum  T remains  above  its  threshold  of  -5. 
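
The two-sided CUSUM calculation described above can be sketched in Python as follows. The function name, the return format, and the convention of reporting the index of the first out-of-control sample are illustrative assumptions; the default values k = 0.5 and h = 5 are those used for Figure 2.3.

```python
def cusum(xs, mean, std, k=0.5, h=5.0):
    """Two-sided CUSUM chart.

    xs   -- sequence of process outputs
    mean -- estimate of the expected value of the output
    std  -- estimate of the process standard deviation
    k    -- slack parameter
    h    -- alarm threshold (alarm when S > h or T < -h)

    Returns (s_values, t_values, first_alarm_index); the index is None
    if the process never goes out of control.
    """
    s = t = 0.0
    s_values, t_values = [], []
    first_alarm = None
    for i, x in enumerate(xs):
        z = (x - mean) / std       # normalized process output
        s = max(0.0, z - k + s)    # accumulates positive deviations
        t = min(0.0, z + k + t)    # accumulates negative deviations
        s_values.append(s)
        t_values.append(t)
        if first_alarm is None and (s > h or t < -h):
            first_alarm = i
    return s_values, t_values, first_alarm
```

As a simple check, a process that sits one standard deviation above its expected value adds 0.5 to S at every sample when k = 0.5, so S first exceeds h = 5 at the eleventh sample (index 10).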


CUSUM charts are generally considered to be effective for detecting gradual shifts in the process
mean. The most commonly used control charts are Shewhart and Shewhart-type charts. Shewhart
charts are effective for detecting large, sudden changes in the process mean. Generally Shewhart
chart limits are set at values of $\bar{x} \pm 3\hat{\sigma}$. In terms of the normalized parameter z, the chart limits
are $z = \pm 3$. Shewhart charts were not investigated as part of this study; however, it is interesting
to note that the basic CUSUM and Shewhart charts are equivalent if the CUSUM parameters k
and h are selected as k = 3 and h = 0.
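
The equivalence can be seen by substituting k = 3 and h = 0 into the CUSUM definitions. The short derivation below assumes that no alarm has yet occurred, so both sums from the previous sample are zero (with h = 0, any nonzero sum is itself an alarm).

```latex
% Assumption: S_{i-1} = T_{i-1} = 0 (no earlier alarm), k = 3, h = 0.
\begin{align*}
S_i &= \max[0,\ z_i - 3] > 0 \iff z_i > 3 \\
T_i &= \min[0,\ z_i + 3] < 0 \iff z_i < -3
\end{align*}
% An alarm (S_i > h or T_i < -h) therefore occurs exactly when |z_i| > 3,
% which is the Shewhart limit in terms of the normalized parameter z.
```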
Figure 2.3: A simple CUSUM control chart indicating an “out of control” process.

2.2.3 System Description

Figure  2.4  is  a schematic  diagram  of  a typical  single  duct  variable-air-volume  (VAV)  box  with 
hydronic  reheat.  The  diagram  depicts  a damper  that  is  used  to  modulate  airflow  to  the  zone  and  a 
control  valve  that  modulates  hot  water  flow  to  the  reheat  coil.  Several  sensors  are  also  shown  in 
Figure  2.4.  The  zone  thermostat  measures  the  air  temperature  in  the  zone.  The  differential 
pressure  transducer  is  used  to  measure  the  flow  rate  of  air  into  the  zone.  Finally,  the  discharge  air 
temperature  sensor  measures  the  temperature  of  the  air  stream  entering  the  zone.  This  sensor  is 
used  to  provide  diagnostic  information  rather  than  for  control  purposes.  The  VAV  box  controller 
reads  the  sensor  information,  computes  control  outputs  for  the  damper  and  reheat  valve,  and 
transmits  these  signals  to  the  appropriate  actuators. 


Figure  2.4:  Schematic  diagram  of  a single  duct  pressure-independent  VAV  box  with 
hydronic  reheat. 

Figure  2.5  shows  a typical  control  sequence  for  a pressure-independent  VAV  box.  A heating  set 
point  and  a cooling  set  point  are  specified.  As  the  zone  temperature  increases  above  the  cooling 
set  point,  the  airflow  rate  to  the  zone  increases  proportionally.  This  is  accomplished  by  resetting 
the  airflow  rate  setpoint  and  modulating  the  damper  to  achieve  this  airflow  rate.  As  the  zone 
temperature  decreases  toward  the  cooling  set  point,  the  airflow  rate  setpoint  is  decreased  and  the 
damper  gradually  closes  until  it  is  providing  the  minimum  flow  rate  necessary  for  ventilation.  If 
the  room  temperature  continues  to  decrease  and  reaches  the  heating  set  point,  the  reheat  valve 
will  begin  to  open.  The  airflow  rate  can  also  be  varied  in  the  heating  mode,  with  the  airflow 
increasing  as  the  temperature  decreases.  Alternatively,  a higher  fixed  airflow  rate  may  be 
specified  for  heating  operation  to  improve  the  distribution  of  the  warm  air.  In  Figure  2.5,  it  is 
assumed that a fixed airflow rate associated with the ventilation requirement of the room is
provided  in  the  heating  mode. 


[Figure 2.5 plot labels: Valve, Airflow]


Figure  2.5:  Damper  and  valve  control  sequence  as  a function  of  room  temperature  for  a 
single  duct  pressure-independent  VAV  box  with  hydronic  reheat. 

2.2.4 CUSUM Applied to VAV Box Diagnostics

The  previous  section  described  one  particular  VAV  box  control  strategy.  However,  a wide 
variety  of  control  strategies  are  employed  by  controller  manufacturers,  most  of  which  use  a 
cascaded  control  loop  to  maintain  the  zone  temperature  and  zone  airflow  rate  at  setpoint  values. 
In order to make VPACC independent of the control strategy used in a particular controller/VAV
box application, four generic errors were identified: the airflow rate error, the absolute value of
the airflow rate error, the temperature error, and the reheat coil differential temperature error. As
long as the VAV box controller has an airflow setpoint, as well as heating and cooling
temperature setpoints, VPACC will function independently of the control strategy used.
Common  mechanical  and  control  faults  will  result  in  a deviation  of  one  or  more  of  these  errors 
from  its  value  during  normal  operation,  which  can  be  detected  by  a CUSUM  chart. 

The airflow rate error, $Q_{error}$, is defined as

$Q_{error} = Q_{actual} - Q_{setpoint}$    (3)

where

$Q_{actual}$ = measured airflow rate
$Q_{setpoint}$ = airflow rate set point.

The CUSUMs of this error, $S_Q$ and $T_Q$, are effective for detecting damper faults and differential
pressure sensor faults associated with airflow measurement.

The absolute value of the airflow rate error, $|Q_{error}|$, is defined as

$|Q_{error}| = |Q_{actual} - Q_{setpoint}|$    (4)

Only one CUSUM value, $S_{|Q|}$, is defined for this error since the error is never negative. $S_{|Q|}$ is
effective for detecting unstable damper control faults.


The temperature error, $T_{error}$, is defined as

$T_{error} = T_{zone} - CSP$  if $T_{zone} > CSP$    (5a)
$T_{error} = 0$  if $HSP \le T_{zone} \le CSP$    (5b)
$T_{error} = T_{zone} - HSP$  if $T_{zone} < HSP$    (5c)

where

$T_{zone}$ = zone temperature
$CSP$ = cooling set point
$HSP$ = heating set point.

The CUSUMs of the temperature error, $S_T$ and $T_T$, are effective for detecting damper faults, valve
faults,  and  temperature  sensor  faults.  The  specific  definition  of  temperature  error  used  in  this 
report  is  based  on  the  control  sequence  described  above.  Various  other  commonly  used  control 
sequences  may  require  changes  to  the  definitions  of  heating  setpoint,  cooling  setpoint,  and 
temperature  error. 


The reheat coil differential temperature error, $\Delta T_{error}$, is defined as

$\Delta T_{error} = T_{discharge} - T_{entering}$  if $u_{hc} = 0$    (6a)
$\Delta T_{error} = 0$  if $u_{hc} \neq 0$    (6b)

where

$T_{discharge}$ = discharge air temperature (the temperature of the air leaving the reheat coil)
$T_{entering}$ = entering air temperature (the temperature of the air entering the reheat coil)
$u_{hc}$ = control signal to the reheat coil valve.

The positive CUSUM of the reheat coil differential temperature error, $S_{\Delta T}$, is effective for
detecting a leaking reheat coil valve fault. The negative CUSUM, $T_{\Delta T}$, is effective for detecting
temperature sensor faults. The leaking valve fault highlights the advantages of automated FDD.
Without VPACC, the local controller may be capable of masking this fault by increasing the
airflow rate into the space. In this scenario there will be no “too hot” or “too cold” complaints, so
the fault can persist unnoticed and a significant energy penalty may be accrued.

The errors and CUSUMs are only calculated during occupied periods. During unoccupied
periods, the errors are not computed, and the CUSUMs are reset to zero. The first hour of the
occupied period is treated the same as the unoccupied period, to allow steady state conditions to
develop.
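
The four process errors of Equations (3) through (6) can be sketched in a few lines of Python. This is an illustrative sketch only, not the embedded controller code: the `VavSample` container and its field names are assumptions introduced here, and units are whatever the controller reports.

```python
from dataclasses import dataclass


@dataclass
class VavSample:
    """One sample of VAV box data (hypothetical field names)."""
    q_actual: float     # measured airflow rate
    q_setpoint: float   # airflow rate set point
    t_zone: float       # zone temperature
    csp: float          # cooling set point
    hsp: float          # heating set point
    t_discharge: float  # air temperature leaving the reheat coil
    t_entering: float   # air temperature entering the reheat coil
    u_hc: float         # control signal to the reheat coil valve


def airflow_error(s: VavSample) -> float:
    return s.q_actual - s.q_setpoint            # Equation (3)


def abs_airflow_error(s: VavSample) -> float:
    return abs(airflow_error(s))                # Equation (4)


def temperature_error(s: VavSample) -> float:
    if s.t_zone > s.csp:
        return s.t_zone - s.csp                 # Equation (5a)
    if s.t_zone < s.hsp:
        return s.t_zone - s.hsp                 # Equation (5c)
    return 0.0                                  # Equation (5b)


def reheat_dt_error(s: VavSample) -> float:
    if s.u_hc == 0:
        return s.t_discharge - s.t_entering     # Equation (6a): valve commanded closed
    return 0.0                                  # Equation (6b): valve open, error not used
```

With the valve commanded closed, a discharge temperature above the entering temperature yields a positive reheat coil error, which is the signature of a leaking valve.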

2.2.5 Point Requirements

Most of the points required by VPACC are already available in the local VAV box controller:
room temperature, cooling setpoint, heating setpoint, airflow rate setpoint, actual airflow rate,
and occupancy status. Entering air temperature is typically not available, so supply air
temperature (available over the control network from the AHU controller) could be used in its
place. Many VAV boxes are equipped with a discharge air temperature sensor, which VPACC
needs in order to calculate the reheat coil differential temperature error. If a discharge air
temperature sensor is not available, a simplified version of VPACC could be used, implementing
the airflow rate error, the absolute value of the airflow rate error, and the temperature error only.

2.2.6  Parameters 

For each process error to which CUSUM analysis is to be applied, there is a set of parameters
that must be known and/or specified. These are the expected value of the process error ($\bar{x}$), the
process error standard deviation ($\sigma$), the slack parameter ($k$), and the alarm limits for the S and T
CUSUMs ($h_S$ and $h_T$). For the purposes of this study, the expected value and standard deviation
of the process error were determined by analysis of a short period of fault-free operation from a
particular data source. CUSUM analysis was performed for each error using an expected value
and standard deviation representative of the VAV boxes from each site. These parameters will be
referred to as the VPACC statistical parameters throughout the remainder of this paper. The
slack parameter $k = 3$ and alarm limits $h_S = h_T = 900$ are the same for all data sources. The
values for $h_S$, $h_T$, and $k$ were determined heuristically based on results of previous work [1]. To
exceed the alarm limit value using one-minute data, an error that is five standard deviations from
the mean would have to persist for 7.5 hours. When a CUSUM does exceed the alarm limit, it is
reset to zero and the calculations resume. Each CUSUM is also reset to zero during unoccupied
periods (and during the first hour of occupancy, to allow steady state conditions to develop).
Thus, the severity of a fault can be established from the number of alarms over a period of time.
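
The 7.5 hour figure can be checked with a short sketch. It assumes the common tabular CUSUM recursion on the normalized error, in which each sample adds its deviation beyond the slack parameter to the running sum; that assumed form is consistent with the numbers quoted above, since (5 - 3) per minute accumulates 900 in 450 minutes. The function names are illustrative, not taken from the controller programs.

```python
def cusum_step(s_prev, t_prev, x, x_bar, sigma, k=3.0):
    """Advance the positive (S) and negative (T) CUSUMs by one sample."""
    z = (x - x_bar) / sigma           # normalized process error
    s = max(0.0, z - k + s_prev)      # grows only when z exceeds +k
    t = max(0.0, -z - k + t_prev)     # grows only when z is below -k
    return s, t


def minutes_to_reach_limit(z, k=3.0, h=900.0):
    """Minutes of one-minute data for a constant normalized error z to accumulate h."""
    if z <= k:
        return float("inf")           # the slack absorbs the error; the sum never grows
    return h / (z - k)


# Constant error five standard deviations above the mean, sampled once per minute.
s = t = 0.0
for _ in range(450):
    s, t = cusum_step(s, t, x=5.0, x_bar=0.0, sigma=1.0)

hours = minutes_to_reach_limit(5.0) / 60.0   # 450 min / 60 = 7.5 h
```

Under these assumptions the S sum reaches the limit of 900 after 450 one-minute samples, while the T sum stays at zero. An error smaller than three standard deviations never accumulates.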

2.2.7  Special  Cases 

2.2.7.1  No  Discharge  Air  Temperature  Sensor 

Many VAV boxes are equipped with a discharge air temperature sensor, which VPACC needs to
calculate the reheat coil differential temperature error. If a discharge air temperature sensor is not
available, a simplified version of VPACC could be used, implementing the airflow rate error, the
absolute value of the airflow rate error, and the temperature error only. In this case, a leaking
reheat coil valve (or, in the case of electric reheat, staged reheat enabled “on” in the cooling
mode) would not be detected unless it was so extreme that the VAV box was unable to maintain
the zone temperature at the set point, thereby causing alarms due to excessive values of $S_T$.

2.2.7.2 Pressure Dependent

In  some  VAV  boxes,  the  damper  is  controlled  directly  in  response  to  zone  temperature  without 
an intermediate determination of an airflow setpoint. $Q_{error}$ and $|Q_{error}|$ do not exist for a pressure
dependent VAV box. In this case, a stuck damper may go undetected. In the case where the zone
is overcooled, the reheat coil valve will open (or staged reheat will be enabled “on” if electric
reheat is employed) and compensate for the fault, masking its existence. In the case where the
zone is undercooled, the rising zone temperature may create alarms due to excessive values of $S_T$.

2.2.7.3 No Reheat

Some  VAV  boxes  do  not  have  reheat  capabilities.  Others  do  not  have  reheat  available  part  of  the 
year  because  a two  pipe  hydronic  system  is  being  used  for  chilled  water  at  that  time.  Since  the 
VAV  box  cannot  take  any  control  action  to  increase  zone  temperature,  a negative  temperature 
error does not necessarily indicate a fault. In this situation, only the $S_T$ CUSUM will be
calculated for $T_{error}$.

2.2.7.4 Dual Duct

In  a dual  duct  VAV  box,  there  is  no  reheat  coil  (and  no  electric  reheat).  Instead  there  are  two  air 
inlets,  namely,  a cold  deck  and  a hot  deck.  Each  air  inlet  has  a damper  and  differential  pressure 
sensor. For this arrangement, two airflow errors ($Q_{error,hot}$ and $Q_{error,cold}$) and two absolute value
airflow errors ($|Q_{error,hot}|$ and $|Q_{error,cold}|$) will be calculated. No $\Delta T_{error}$ will be calculated as there
is no reheat capability.
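
The special cases above amount to choosing which CUSUMs to compute for a given box configuration. The sketch below is one hypothetical way to encode that selection. The function name, flag names, and CUSUM labels are invented for illustration, and two points are assumptions rather than statements from this report: that a box without reheat also skips the reheat coil error, and that a dual duct box keeps both temperature CUSUMs.

```python
def applicable_cusums(pressure_independent=True, has_reheat=True,
                      has_discharge_sensor=True, dual_duct=False):
    """Return the names of the CUSUMs to compute for one VAV box configuration."""
    names = []
    if dual_duct:
        # Two inlets, each with its own damper and differential pressure sensor.
        for deck in ("hot", "cold"):
            names += [f"S_Q_{deck}", f"T_Q_{deck}", f"S_absQ_{deck}"]
        names += ["S_T", "T_T"]       # assumption: both temperature CUSUMs are kept
        return names                  # no reheat coil, so no delta-T CUSUMs
    if pressure_independent:
        # Airflow errors exist only when the controller has an airflow setpoint.
        names += ["S_Q", "T_Q", "S_absQ"]
    names.append("S_T")
    if has_reheat:
        names.append("T_T")           # a negative temperature error is meaningful
        if has_discharge_sensor:
            names += ["S_dT", "T_dT"]
    return names
```

For example, a pressure-independent box with reheat but no discharge air temperature sensor would compute the airflow and temperature CUSUMs only.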

3 Testing Environment

Two  different  testing  environments  were  used  to  evaluate  the  embedded  FDD  tools:  the  NIST 
Virtual  Cybernetic  Building  Testbed  (VCBT)  and  the  Iowa  Energy  Center  Energy  Resource 
Station  (ERS). 

3.1  VCBT 

The  VCBT  is  a simulation-emulation  environment  that  combines  simulations  of  a building  and 
its  HVAC  system  with  actual  commercial  HVAC  equipment  controllers.  It  provides  a way  to 
conduct  tests  under  a wide  variety  of  carefully  controlled  conditions  and  to  compare  the  results 
of  several  different  commercial  products.  Emulation  provides  a test  environment  that  is  closer  to 
a real  building  because  it  uses  real  building  controllers  but,  like  simulation,  it  also  provides 
carefully  controlled  and  reproducible  conditions.  Because  emulation  is  done  in  real  time  it  takes 
much  longer  than  simulation,  making  it  more  difficult  to  test  a broad  range  of  faults  and 
conditions  in  a limited  time.  Details  of  the  VCBT  design  and  operation  are  documented  in  [4]. 

For  this  study,  the  VCBT  was  configured  with  one  AHU  for  each  of  the  three  floors,  designated 
AHU-A,  -B,  and  -C.  AHU-A  and  -C  are  VAV  systems,  each  with  three  VAV  boxes.  AHU-B  is 
a constant  volume  system,  with  three  zone  reheat  coils. 

3.1.1  AHU  Control  Strategies 

The  control  strategies  described  below  reflect  the  logic  that  is  executed  by  the  AHU  controllers. 

3.1.1.1  AHU-A  Control  Strategy 

3.1.1.1.1 Fan Control

The  supply  air  fan  speed  is  controlled  to  maintain  the  supply  air  pressure  at  a fixed  set  point. 
The  return  air  fan  speed  is  controlled  to  maintain  a constant  difference  between  the  supply  and 
return  air  flow  rates. 

3.1.1.1.2 Temperature Control

AHU-A  uses  a single  PI  control  loop  to  determine  a temperature  control  signal  to  maintain  the 
supply  air  temperature  at  a fixed  set  point.  Depending  on  the  magnitude  of  the  signal,  it  is 
mapped  to  one  of  three  outputs  which  control  the  heating  coil,  cooling  coil,  and  mixing-box 
dampers. Additional logic sets the position of the other two outputs appropriately depending
on  which  one  is  active.  For  example,  if  the  heating  coil  valve  is  active,  the  cooling  coil  valve 
will  be  fully  closed  and  the  mixing  box  damper  will  be  set  to  the  minimum  position  (which 
depends  on  the  occupancy  status).  The  outdoor  air  and  the  return  air  enthalpies  are  compared  to 
determine  whether  to  enable  or  disable  economizer  operation. 
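
Mapping a single control signal onto three outputs can be sketched as follows. This is a generic split-range illustration, not the AHU-A program: the signal range of -1 to 2, the 20 % minimum damper position, and the function name are all assumptions, since the report does not give the actual mapping.

```python
def split_range(signal, min_damper=0.2, economizer_enabled=True):
    """Map one PI output in [-1, 2] to (heating_valve, damper, cooling_valve), each in [0, 1].

    Hypothetical ranges: [-1, 0) heating, [0, 1] mixing-box dampers, (1, 2] cooling.
    """
    signal = max(-1.0, min(2.0, signal))
    if signal < 0.0:
        # Heating active: cooling valve closed, dampers at minimum position.
        return -signal, min_damper, 0.0
    if signal <= 1.0:
        # Dampers active: both valves closed; dampers modulate only if the economizer is enabled.
        if economizer_enabled:
            damper = min_damper + (1.0 - min_damper) * signal
        else:
            damper = min_damper
        return 0.0, damper, 0.0
    # Cooling active: heating valve closed; dampers fully open only if the economizer is enabled.
    damper = 1.0 if economizer_enabled else min_damper
    return 0.0, damper, signal - 1.0
```

In this sketch only one output modulates at a time, and the other two are held at fixed positions, which mirrors the behavior described for the heating coil example above.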

3.1.1.2 AHU-B Control Strategy

3.1.1.2.1 Fan Control

AHU-B  is  a constant  volume  system,  so  the  supply  air  fan  and  return  air  fan  both  operate  at  fixed 
speeds.

3.1.1.2.2 Temperature Control

AHU-B  uses  a single  PI  control  loop  to  determine  a temperature  control  signal  to  maintain  the 
supply  air  temperature  at  a fixed  set  point.  Depending  on  the  magnitude  of  the  signal,  it  is 
mapped  to  one  of  three  outputs  which  control  the  heating  coil,  cooling  coil,  and  mixing-box 
dampers. Additional logic sets the position of the other two outputs appropriately depending
on  which  one  is  active.  For  example,  if  the  heating  coil  valve  is  active,  the  cooling  coil  valve 
will  be  fully  closed  and  the  mixing  box  damper  will  be  set  to  the  minimum  position  (which 
depends  on  the  occupancy  status).  The  outdoor  air  temperature  is  compared  to  a fixed 
changeover  temperature  to  determine  whether  to  enable  or  disable  economizer  operation. 

3.1.1.3  AHU-C  Control  Strategy 

3.1.1.3.1 Fan Control

The  supply  air  fan  speed  is  controlled  to  maintain  the  supply  air  pressure  at  a fixed  set  point. 
The  return  air  fan  speed  is  controlled  to  maintain  a constant  difference  between  the  supply  and 
return  air  flow  rates. 

3.1.1.3.2 Temperature Control

AHU-C  uses  a separate  PI  control  loop  for  each  of  three  outputs  which  control  the  supply  air 
temperature:  the  heating  coil,  cooling  coil,  and  mixing-box  dampers.  The  heating  coil  and 
cooling  coil  outputs  are  controlled  to  maintain  supply  air  temperature  at  a fixed  set  point.  The 
mixing  box  dampers  are  controlled  by  comparing  the  outside  air  and  the  return  air  enthalpies.  If 
the  return  air  enthalpy  is  greater  than  the  outdoor  air  enthalpy,  the  mixing-box  damper  control 
loop  maintains  mixed  air  temperature  at  its  set  point  (also  fixed).  Interlocks,  dead  bands,  and 
time  delays  are  incorporated  to  prevent  simultaneous  heating,  cooling,  and  economizing. 

3.1.2  Embedded  APAR 

APAR was embedded in the AHU controllers by adding logic to their control
programs  to  execute  the  APAR  algorithm.  The  three  AHU  controllers  are  each  from  a different 
manufacturer. 

The  cooling  coil  valve,  mixing  box  damper,  and  heating  coil  valve  control  signals  are  evaluated, 
along  with  the  occupancy  status,  to  determine  in  which  mode  the  AHU  is  operating.  A binary 
value  (BV)  is  used  to  represent  the  status  of  each  mode.  Exponentially  weighted  moving 
averages  (EWMAs)  are  computed  for  these  control  signals,  as  well  as  for  supply  air  temperature, 
return  air  temperature,  mixed  air  temperature,  outdoor  air  temperature,  supply  air  temperature 
setpoint,  return  air  humidity,  return  air  enthalpy,  outside  air  humidity,  and  outside  air  enthalpy. 
EWMAs  are  used  to  smooth  the  variation  in  the  data  because,  unlike  other  types  of  moving 
averages,  no  historical  data  need  be  stored  in  the  controller’s  memory  - only  the  current  value 
and  EWMA  of  each  measurement  need  to  be  stored.  The  EWMAs  are  reset  when  a mode  switch 
occurs.  A 60  minute  timer  starts  when  the  logic  recognizes  that  the  AHU  is  operating  in  one  of 
the  five  defined  modes  of  operation,  and  is  reset  when  the  AHU  is  no  longer  in  that  mode.  When 
the timer expires, the rules for that particular mode are evaluated. Each rule is represented by a
binary value that is set to “on” if the rule is satisfied, indicating a fault, and to “off” otherwise.
Rule 28, regarding excessive mode switches, is a special case because it is evaluated
independently of the modes. To evaluate Rule 28, a counter is incremented every time a mode
switch occurs. A timer resets the counter to zero every 60 minutes. If the value of the counter
exceeds the maximum number of mode switches, the binary value representing Rule 28 is set to
“on”, indicating a fault; otherwise it is set to “off”.
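
The memory argument for EWMAs and the Rule 28 counter can both be illustrated briefly. The sketch below uses the standard EWMA recursion; the smoothing constant `alpha`, the class names, and the method names are assumptions, since the report does not specify them here.

```python
class Ewma:
    """Exponentially weighted moving average that stores only its current value."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha            # assumed smoothing constant, 0 < alpha <= 1
        self.value = None

    def update(self, x):
        if self.value is None:
            self.value = x            # first sample after a reset seeds the average
        else:
            self.value = self.alpha * x + (1.0 - self.alpha) * self.value
        return self.value

    def reset(self):
        """Called when a mode switch occurs."""
        self.value = None


class ModeSwitchCounter:
    """Counts mode switches between timer resets, in the spirit of Rule 28."""

    def __init__(self, max_switches):
        self.max_switches = max_switches
        self.count = 0

    def on_mode_switch(self):
        self.count += 1

    def on_timer_expired(self):
        """Called every 60 minutes."""
        self.count = 0

    def fault(self):
        return self.count > self.max_switches
```

Each `Ewma` instance holds one number regardless of how long it has been running, which is why the approach suits controllers with limited memory.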

A stand-alone  program,  BACnet  Data  Source  (BDS)  [4],  was  installed  on  a workstation 
connected  to  the  VCBT  control  network  and  configured  to  trend  the  raw  data  (temperatures, 
valve  positions,  etc.)  as  well  as  intermediate  values  and  results  of  the  APAR  algorithm  (mode 
and  rule  status).  The  raw  data  was  read  once  per  minute,  while  the  APAR  values  were  read  once 
every  five  minutes.  The  EWMAs  were  not  trended,  as  they  could  easily  be  recreated  from  the 
raw  data  and  mode  status  if  necessary.  For  every  fault  detected  by  APAR,  the  raw  data  was 
examined  to  verify  that  the  FDD  results  matched  the  conditions. 

3.1.3  VCBT  Fault  Descriptions 

3.1.3.1  Supply  Air  Temperature  Sensor  Drift 

Supply  air  temperature  sensor  drift  is  introduced  as  a sensor  offset  for  a range  of  0.0  °C  to  ±4.0 
°C  (0.0  °F  to  ±7.2  °F),  applied  linearly  over  a three  week  emulation  period.  If  a controller 
maintains  the  measured  supply  air  temperature  at  the  set  point,  a negative  sensor  offset  would 
result  in  a decreased  actual  supply  air  temperature.  A positive  sensor  offset  would  result  in  an 
increased  actual  supply  air  temperature. 
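
A linearly applied drift of this kind can be sketched as a simple ramp. The function below is illustrative only: its name and arguments are assumptions, and it does not represent the VCBT fault-injection code.

```python
def drift_offset(elapsed_days, final_offset_c, period_days=21.0):
    """Sensor offset in degrees C after elapsed_days of a linear drift.

    The offset ramps from 0.0 to final_offset_c (for example +4.0 or -4.0)
    over period_days, and is held at the end points outside that window.
    """
    fraction = min(max(elapsed_days / period_days, 0.0), 1.0)
    return final_offset_c * fraction
```

Halfway through a three week emulation with a final offset of -4.0 °C, the sketch gives an offset of -2.0 °C.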

3.1.3.2  Return  Air  Temperature  Sensor  Drift 

Return  air  temperature  sensor  drift  is  introduced  as  a sensor  offset  for  a range  of  0.0  °C  to  ±4.0 
°C  (0.0  °F  to  ±7.2  °F),  applied  linearly  over  a three  week  emulation  period.  The  return  air 
temperature  sensor  is  used  to  control  the  economizer  operation  for  AHU-A  and  -C.  A negative 
sensor  drift  for  the  return  air  temperature  with  no  change  to  the  humidity  reading  would  result  in 
an  increased  return  air  enthalpy.  A positive  sensor  drift  for  the  return  air  temperature  with  no 
change  to  the  humidity  reading  would  result  in  a decreased  return  air  enthalpy.  Depending  on 
weather  conditions,  this  fault  may  cause  inappropriate  economizer  control. 

3.1.3.3  Mixed  Air  Temperature  Sensor  Drift 

Mixed  air  temperature  sensor  drift  is  introduced  as  a sensor  offset  for  a range  of  0.0  °C  to  ±4.0  °C 
(0.0  °F  to  ±7.2  °F),  applied  linearly  over  a three  week  emulation  period.  If  a controller  maintains 
the  measured  mixed  air  temperature  at  the  set  point,  a negative  sensor  offset  would  result  in  a 
decreased  actual  mixed  air  temperature.  A positive  sensor  offset  would  result  in  an  increased 
actual  mixed  air  temperature. 

3.1.3.4  Outdoor  Air  Temperature  Sensor  Drift 

Outdoor  air  temperature  sensor  drift  is  introduced  as  a sensor  offset  for  a range  of  0.0  °C  to  ±4.0 
°C  (0.0  °F  to  ±7.2  °F),  applied  linearly  over  a three  week  emulation  period.  The  outdoor  air 
temperature  sensor  is  used  to  control  the  economizer  operation  for  AHU-A,  -B,  and  -C.  A 
negative  sensor  drift  for  the  outdoor  air  temperature  with  no  change  to  the  humidity  reading 
would  result  in  an  increased  outdoor  air  enthalpy.  A positive  sensor  drift  for  the  outdoor  air 
temperature  with  no  change  to  the  humidity  reading  would  result  in  a decreased  outdoor  air 
enthalpy.  Depending  on  weather  conditions,  this  fault  may  cause  inappropriate  economizer 
control.

3.1.3.5 Outdoor Air Damper Fault

The  outdoor  air  damper  faults  are  introduced  by  overriding  the  normal  control  signal  to  the 
damper  with  a control  signal  to  force  the  motor-driven  actuator  to  the  specified  position,  causing 
the  damper  to  stay  at  that  position  throughout  the  emulation  period.  While  emulating  the  outdoor 
air  damper  fault,  the  recirculation  air  and  the  exhaust  air  dampers  follow  normal  operation.  The 
economizer  may  not  operate  correctly  because  of  the  fault,  depending  on  the  outdoor  and  indoor 
conditions. 

3.1.3.6  Recirculation  Air  Damper  Fault 

The  recirculation  air  damper  faults  are  introduced  by  overriding  the  normal  control  signal  to  the 
damper  with  a control  signal  to  force  the  motor-driven  actuator  to  the  specified  position,  causing 
the  damper  to  stay  at  that  position  throughout  the  emulation  period.  While  emulating  the 
recirculation air damper fault, the outdoor air and the exhaust air dampers follow normal
operation.  The  economizer  may  not  operate  correctly  because  of  the  fault,  depending  on  the 
outdoor  and  indoor  conditions. 

3.1.3.7 Economizer Control Logic Fault

This  fault  is  introduced  by  reversing  the  logic  used  to  decide  whether  the  economizer  or  the 
minimum  ventilation  operation  should  become  active.  The  fan  speed  and  temperature  control 
loops  operate  normally. 

3.1.3.8  Temperature  Sensor  Failure 

A supply  air,  return  air,  mixed  air,  or  outdoor  air  temperature  sensor  fault  is  introduced  by 
disconnecting  the  leads  to  the  appropriate  sensor  terminals  on  the  AHU  controller. 

3.2  ERS 

The  ERS  is  a laboratory  facility  for  HVAC  research  that  has  two  test  VAV  air-handling  systems, 
each  serving  four  test  zones.  The  HVAC  equipment  and  controllers  are  typical  of  that  found  in 
commercial  buildings.  The  VAV  boxes  are  single  duct  throttling  units  having  both  hydronic  and 
electric  reheat  capabilities.  They  were  operated  with  hydronic  reheat  to  produce  the  data  for  this 
study.  The  VAV  boxes  are  well  instrumented;  many  more  points  are  monitored  than  would 
commonly  be  available  in  a commercial  building.  Details  of  the  facility  are  provided  in  [5]. 

3.2.1  Embedded  VPACC 

VPACC was embedded in four VAV box controllers by adding logic to their control
programs to carry out the VPACC algorithm. It was necessary to limit the testing to a single
manufacturer’s  controllers  to  enable  communication  between  the  VAV  box  controllers  and  the 
balance  of  the  control  system. 

The airflow error, absolute value of the airflow error, temperature error, and reheat coil $\Delta T$ error
are calculated continuously. A sixty minute timer is started when the occupancy status goes
from unoccupied to occupied. When the timer expires, the S and T cumulative sums
(each sum is represented by an analog value) are evaluated once per minute for each of the
errors, with the following exceptions: for $|Q_{error}|$, only the S cumulative sum is calculated,
and the $\Delta T_{error}$ S and T cumulative sums are set to zero if the reheat coil control valve is not
fully closed. There is an alarm status, represented by a binary value, for each cumulative sum. If
the sum is greater than the alarm limit (h), the corresponding alarm status is set to a value of
“on”. If the sum then falls below h, the alarm status is set to “off”. There is also an overall
VPACC alarm status, which is set to a value of “on” if any of the cumulative sum alarm statuses
are “on”. If the cumulative sum alarm statuses are all “off”, then the VPACC alarm status is set
to “off”.
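
The alarm logic described above reduces to a per-sum comparison followed by a logical OR. A minimal sketch, with the function name and the dictionary layout assumed for illustration:

```python
def alarm_statuses(cusums, h=900.0):
    """Return (per-sum alarm statuses, overall VPACC alarm status).

    cusums maps a cumulative sum name to its current value.
    A status is True ("on") while its sum is greater than the alarm limit h.
    """
    statuses = {name: value > h for name, value in cusums.items()}
    overall = any(statuses.values())   # "on" if any individual alarm is "on"
    return statuses, overall
```

For instance, `alarm_statuses({"S_Q": 950.0, "T_Q": 0.0})` marks only `S_Q` as in alarm and sets the overall status to “on”.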

The  operator  interface  software  written  by  the  manufacturer  of  the  VAV  box  controllers  was 
installed  on  a workstation  connected  to  the  ERS  control  network  and  configured  to  trend  the  raw 
data (temperatures, humidities, etc.) as well as intermediate values and results of the VPACC
algorithm (S and T cumulative sum values and alarm statuses). The raw data and VPACC values
were read once per minute. For every fault detected by VPACC, the raw data was examined to
verify  that  the  FDD  results  matched  the  conditions. 


3.2.2  ERS  Fault  Descriptions 

3.2.2.1 Damper Stuck Partially Open

This  fault  is  introduced  by  overriding  the  VAV  box  damper  actuator  to  a fixed  position  that 
produces  a flow  rate  between  the  minimum  and  maximum  specified  for  that  box.  If  the  zone 
airflow  is  lower  than  necessary,  the  zone  temperature  will  drift  above  the  cooling  set  point.  If  the 
zone  airflow  is  higher  than  necessary,  the  controller  will  transition  to  the  heating  mode  and  the 
reheat  coil  valve  will  modulate  to  maintain  the  zone  temperature  at  the  set  point. 

3.2.2.2  Hydronic  Reheat  Coil  Valve  Stuck  Partially  Open 

This  fault  is  introduced  by  overriding  the  VAV  box  hydronic  reheat  coil  valve  actuator  position 
to  allow  a hot  water  flow  rate  of  approximately  2 % to  10  % of  the  maximum  flow  through  the 
coil.  Depending  on  the  zone  conditions  and  the  severity  of  the  fault,  the  stuck  reheat  valve  either 
creates  an  additional  cooling  load  that  the  AHU  must  try  to  remove,  or  it  prevents  the  valve  from 
modulating  to  provide  additional  heating  energy  to  the  zone.  In  the  first  case,  the  controller 
increases  the  airflow  rate  to  the  zone  in  an  attempt  to  compensate  for  the  fault.  If  the  fault  is 
severe,  the  zone  temperature  will  gradually  increase  beyond  the  zone  cooling  set  point.  In  the 
second  case,  the  zone  temperature  will  tend  to  gradually  decrease  below  the  zone  heating  set 
point. 

3.2.2.3  Failed  Differential  Pressure  Sensor 

This  fault  is  introduced  by  disconnecting  both  tubing  leads  to  the  differential  pressure  sensor. 
The  fault  causes  the  VAV  box  damper  to  go  to  the  full  open  position  because  the  flow  sensor 
indicates an airflow rate of zero and the control loop will attempt to correct for this condition.

3.2.2.4 Unstable Flow Control


The  fault  is  implemented  by  altering  a component  of  the  control  logic  that  limits  the  rate  of 
increase  and  decrease  of  the  airflow  control  output.  The  fault  causes  the  VAV  box  damper  to 
oscillate, thereby producing airflow rates that oscillate about the set point.

4 Results


4.1  VCBT 

The  VCBT  emulated  an  HVAC  system  operating  during  heating  season  (using  February  weather 
data),  swing  season  (using  October  weather  data),  and  cooling  season  (using  July  weather  data). 
A variety  of  sensor,  actuator,  and  control  logic  faults,  along  with  fault  free  conditions,  were 
imposed.  The  results  of  the  FDD  calculations  performed  by  the  AHU  controllers  are  shown  in 
Table  4.1.  A false  positive  is  a false  alarm.  A false  negative  is  an  undetected  fault. 


Table 4.1 VCBT Embedded FDD Result Summary

System  Fault                                    Season   Correct  False Positive  False Negative
AHU-A   Supply Air Temperature Sensor Drift      Heating  X
AHU-A   Recirculation Air Damper Leakage         Swing    X
AHU-A   Outdoor Air Temperature Sensor Failure   Swing    X
AHU-A   Mixed Air Temperature Sensor Drift       Swing    X
AHU-A   Supply Air Temperature Sensor Failure    Cooling  X
AHU-A   Recirculation Air Damper Stuck Closed    Cooling  X
AHU-B   Fault Free                               Heating  X
AHU-B   Fault Free                               Swing    X
AHU-B   Return Air Temperature Sensor Drift      Swing                             X
AHU-B   Fault Free                               Swing    X
AHU-B   Economizer Control Logic Fault           Cooling  X
AHU-B   Economizer Control Logic Fault           Cooling  X
AHU-C   Mixed Air Temperature Sensor Failure     Heating  X
AHU-C   Return Air Temperature Sensor Drift      Swing                             X
AHU-C   Outdoor Air Damper Stuck at Minimum      Swing                             X
AHU-C   Recirculation Air Damper Leakage         Swing    X
AHU-C   Return Air Temperature Sensor Drift      Cooling  X
AHU-C   Mixed Air Temperature Sensor Drift       Cooling  X

The AHU controller determined the correct fault status in 15 of the 18 cases. There were zero
false positives and three false negatives. Two of the false negatives were return air temperature
sensor drift faults in AHU-B and AHU-C, both in swing season. The swing season weather
conditions resulted in the AHUs operating in Modes 2 (cooling with outdoor air) and 3
(mechanical cooling with 100 % outdoor air) most of the time. However, the return air
temperature sensor drift fault was also introduced in AHU-C in cooling season, when the AHUs
operated mostly in Mode 4 (mechanical cooling with minimum outdoor air). This time, the
return air temperature sensor drift was detected. This shows that some faults can be detected
under certain conditions, but not others. The other false negative was an outdoor air damper in
AHU-C stuck at the minimum position during swing season. In this case, the AHU controller
can still adjust the ratio of return to outdoor air by modulating the recirculation and exhaust
dampers. This is an example of a fault that is masked by the control system.

Several  examples  are  presented  to  illustrate  the  details  of  the  APAR  algorithm. 

4.1.1  Outdoor  Air  Temperature  Sensor  Failure 

This fault was introduced by disconnecting the leads to the terminals on the AHU-A controller
to which the outdoor air temperature sensor was connected. This is a 0 VDC to 10 VDC analog
input calibrated to a range of -6 °C to 49 °C (20 °F to 120 °F), so with the leads disconnected, the
controller reads 0 VDC, which is scaled to an outdoor air temperature of -6 °C (20 °F).
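
The scaling that produces the erroneous reading can be reproduced with a linear map from volts to temperature. The sketch below assumes a purely linear calibration over the stated range; the function names are illustrative and are not taken from the controller.

```python
def scale_analog_input(volts, v_min=0.0, v_max=10.0, t_min_f=20.0, t_max_f=120.0):
    """Linearly scale an analog input voltage to a temperature in degrees F."""
    return t_min_f + (volts - v_min) * (t_max_f - t_min_f) / (v_max - v_min)


def fahrenheit_to_celsius(t_f):
    return (t_f - 32.0) * 5.0 / 9.0


# Disconnected leads read 0 VDC, which scales to the bottom of the calibrated range.
reading_f = scale_analog_input(0.0)
reading_c = fahrenheit_to_celsius(reading_f)
```

With 0 VDC at the input, the scaled value is 20 °F, or about -6.7 °C, which corresponds to the -6 °C (20 °F) quoted above.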

Swing  season  is  characterized  by  substantial  variation  in  outdoor  air  temperature  and  humidity. 
Modes 2, 3, and 4 (cooling with outdoor air, mechanical cooling with 100 % outdoor air, and
mechanical  cooling  with  minimum  outdoor  air,  respectively)  are  all  encountered  during  a fault 
free  three  week  swing  season  emulation.  When  the  fault  is  introduced,  the  controller  reads  an 
outdoor  air  temperature  of  -6  °C  (20  °F).  The  logic  in  the  controller  calculates  outdoor  air 
enthalpy  based  on  outdoor  air  temperature  and  outdoor  air  humidity,  so  the  erroneously  low 
reading  of  outdoor  air  temperature  will  result  in  an  erroneously  low  calculated  value  of  outdoor 
air  enthalpy.  The  AHU-A  controller  compares  the  calculated  value  of  outdoor  air  enthalpy  to 
return  air  enthalpy  (calculated  in  a similar  fashion  based  on  return  air  temperature  and  return  air 
humidity)  and  enables  economizer  operation  if  the  outdoor  air  enthalpy  is  less  than  the  return  air 
enthalpy.  The  erroneously  low  calculated  value  of  outdoor  air  enthalpy  will  cause  the  controller 
to  enable  economizer  operation,  even  if  such  operation  is  inappropriate. 
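The effect of the failed sensor on the enthalpy comparison can be illustrated with a short sketch. Everything specific here is an assumption rather than the AHU-A controller's actual logic: the humidity input is taken to be relative humidity, the saturation pressure uses the Magnus approximation, the enthalpy uses a common moist-air approximation at sea-level pressure, and the example conditions are hypothetical.

```python
import math

ATM_PRESSURE_PA = 101325.0  # assumed sea-level barometric pressure


def saturation_pressure_pa(temp_c):
    """Magnus approximation of saturation vapor pressure over water, in Pa."""
    return 610.94 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def moist_air_enthalpy(temp_c, rel_humidity):
    """Approximate moist-air enthalpy in kJ per kg of dry air.

    rel_humidity is a fraction between 0 and 1.
    """
    p_vapor = rel_humidity * saturation_pressure_pa(temp_c)
    humidity_ratio = 0.621945 * p_vapor / (ATM_PRESSURE_PA - p_vapor)
    return 1.006 * temp_c + humidity_ratio * (2501.0 + 1.86 * temp_c)


def economizer_enabled(oa_temp_c, oa_rh, ra_temp_c, ra_rh):
    """Enable economizer operation when outdoor air enthalpy is below return air enthalpy."""
    return moist_air_enthalpy(oa_temp_c, oa_rh) < moist_air_enthalpy(ra_temp_c, ra_rh)


# Hypothetical warm, humid day: outdoor air 30 °C at 60 % RH, return air 24 °C at 50 % RH.
healthy = economizer_enabled(30.0, 0.60, 24.0, 0.50)
# Same day with the failed sensor reading the bottom of its range (about -6.7 °C).
faulted = economizer_enabled(-6.67, 0.60, 24.0, 0.50)
print(healthy, faulted)
```

With the healthy reading the economizer stays disabled; with the erroneously low reading the calculated outdoor air enthalpy drops below the return air enthalpy and the economizer is enabled.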

Figures  4.1  and  4.2  show  data  from  AHU-A  from  the  occupied  portion  of  one  day  during  the 
emulation  of  this  fault.  Since  the  actual  outdoor  air  temperature  is  greater  than  the  supply  air 
temperature  setpoint,  the  mixing  box  dampers  (green,  figure  4.1)  will  saturate  at  the  100  % 
outdoor  air  position,  and  the  AHU  controller  will  modulate  the  cooling  coil  valve  (blue,  figure 
4.1)  to  maintain  the  supply  air  temperature  at  its  setpoint.  The  heating  coil  valve  (red,  figure  4.1) 
is  closed.  Based  on  this  combination  of  control  signals,  APAR  determines  the  system  to  be 
operating  in  Mode  3 (mechanical  cooling  with  100  % outdoor  air)  and  evaluates  the  applicable 
rule  set.  One  of  the  rules  in  the  set  for  Mode  3 is  Rule  8,  which  states  that  if  the  outdoor  air 
temperature  (green,  figure  4.2)  is  less  than  the  supply  air  temperature  setpoint  (not  shown, 
constant  12.8  °C  (55  °F))  minus  the  temperature  rise  across  the  supply  fan  (1.1  °C  (2  °F))  by 
more  than  a certain  threshold  (1.7  °C  (3  °F)),  then  a fault  has  been  detected.  Another  rule  in  the 
set  for  Mode  3 is  Rule  10,  which  states  that  if  the  absolute  value  of  the  difference  between 
outdoor  air  temperature  and  mixed  air  temperature  (purple,  figure  4.2)  is  greater  than  a certain 
threshold  (1.7  °C  (3  °F)),  then  a fault  has  been  detected.  On  the  day  shown  in  figures  4.1  and 
4.2,  Rules  8 and  10  are  satisfied,  indicating  that  this  fault  has  been  successfully  detected.  The 
fault  status  data  collected  from  the  AHU-A  controller  indicates  that  this  fault  was  detected  on 
this  particular  day  of  operation. 
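Rules 8 and 10 as described in this paragraph can be written as two simple comparisons. The sketch below uses the setpoint, fan temperature rise, and threshold quoted in the text; the function names and the example mixed air temperature are hypothetical, and the rules are assumed to be evaluated only after APAR has identified Mode 3.

```python
SUPPLY_AIR_SETPOINT_C = 12.8  # supply air temperature setpoint from the text
SUPPLY_FAN_RISE_C = 1.1       # temperature rise across the supply fan
THRESHOLD_C = 1.7             # temperature threshold used by both rules


def rule_8(oa_temp_c):
    """Fault if outdoor air is too cold for mechanical cooling with 100 % outdoor air."""
    return oa_temp_c < SUPPLY_AIR_SETPOINT_C - SUPPLY_FAN_RISE_C - THRESHOLD_C


def rule_10(oa_temp_c, ma_temp_c):
    """Fault if outdoor and mixed air temperatures differ at 100 % outdoor air."""
    return abs(oa_temp_c - ma_temp_c) > THRESHOLD_C


# Failed sensor reads about -6.7 °C while the (hypothetical) mixed air is at 28 °C.
print(rule_8(-6.67), rule_10(-6.67, 28.0))
```

Both rules are satisfied for the failed-sensor reading, while a plausible healthy reading (outdoor air near the mixed air temperature and above the setpoint) satisfies neither.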

4.1.2  Mixed  Air  Temperature  Sensor  Drift 

This fault is introduced as a sensor offset beginning at 0 °C and increasing linearly over a
three-week emulation period to +4 °C. The positive sensor drift means that the measured mixed air
temperature  is  greater  than  the  actual  mixed  air  temperature  by  the  amount  of  the  offset. 

Figure 4.1 VCBT AHU-A Outdoor Air Temperature Sensor Failure - Control Signals

Figures 4.3 and 4.4 show AHU-C data from the occupied portion of one day during the
emulation  of  this  fault.  On  this  particular  day,  the  outdoor  air  temperature  (green,  figure  4.4)  and 
humidity  (not  shown)  are  high  enough  to  prohibit  economizing,  as  they  typically  are  during 
cooling  season,  so  the  mixing  box  dampers  (green,  figure  4.3)  are  aligned  to  bring  in  the 
minimum  amount  of  outdoor  air  needed  for  ventilation.  The  AHU  controller  modulates  the 
cooling  coil  valve  (blue,  figure  4.3)  to  maintain  the  supply  air  temperature  (light  blue,  figure  4.4) 
at  its  setpoint  (dark  blue,  figure  4.4).  The  heating  coil  valve  (red,  figure  4.3)  is  closed. 

Based  on  this  combination  of  control  signals,  APAR  determines  the  system  to  be  operating  in 
Mode  4 (mechanical  cooling  with  minimum  outdoor  air)  and  evaluates  the  applicable  rule  set. 
One  of  the  rules  in  the  set  for  Mode  4 is  Rule  26,  which  states  that  the  mixed  air  temperature 
(brown,  figure  4.4)  should  be  between  the  return  air  temperature  (red,  figure  4.4)  and  outdoor  air 
temperature  (green,  figure  4.4).  If  the  mixed  air  temperature  is  below  the  lesser  of  the  return  air 
or  outdoor  air  temperature,  or  above  the  greater  of  the  return  air  or  outdoor  air  temperature,  by  a 
certain  threshold  (1.7  °C  (3  °F)),  then  a fault  has  been  detected.  The  actual  mixed  air 
temperature  is,  in  fact,  between  the  return  air  and  outdoor  air  temperature,  since  it  is  the  result  of 
blending  the  outdoor  air  and  return  air  streams.  However,  due  to  the  sensor  drift,  the  measured 
mixed  air  temperature  is  below  the  return  air  temperature  (the  lesser  of  the  return  air  and  outdoor 
air  temperature)  by  approximately  3 °C  (5.4  °F).  Rule  26  is  satisfied,  indicating  that  this  fault 
has  been  successfully  detected.  The  fault  status  data  collected  from  the  AHU-C  controller 
indicates  that  this  fault  was  detected  on  this  particular  day  of  operation. 
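Rule 26 amounts to a bounds check on the measured mixed air temperature. The following is a minimal sketch using the threshold given in the text; the function name and the example temperatures are hypothetical.

```python
def rule_26(ma_temp_c, ra_temp_c, oa_temp_c, threshold_c=1.7):
    """Fault if mixed air lies outside the return/outdoor air range by more than the threshold."""
    low = min(ra_temp_c, oa_temp_c)
    high = max(ra_temp_c, oa_temp_c)
    return ma_temp_c < low - threshold_c or ma_temp_c > high + threshold_c


# Hypothetical cooling-season values: return air 24 °C, outdoor air 30 °C.
print(rule_26(21.0, 24.0, 30.0))  # measured mixed air 3 °C below return air
print(rule_26(25.0, 24.0, 30.0))  # measured mixed air between the two streams
```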

4.1.3  Recirculation  Damper  Stuck  Closed 

When  this  fault  is  introduced,  the  AHU  controller  calculates  the  desired  position  of  the 
recirculation  damper  and  sets  the  damper  control  signal  normally.  Within  the  emulation,  the 
damper  position  is  set  to  the  fully  closed  position,  corresponding  to  100  % outdoor  air 
throughout  the  emulation  period.  During  emulation  of  this  fault,  the  outdoor  air  and  exhaust  air 
dampers  follow  normal  operation. 

Figures  4.5  and  4.6  show  data  from  AHU-A  from  the  occupied  portion  of  one  day  during  the 
emulation  of  this  fault.  On  this  particular  day,  the  outdoor  air  temperature  (green,  figure  4.6)  and 
humidity  (not  shown)  are  high  enough  to  prohibit  economizing,  as  they  typically  are  during 
cooling  season,  so  the  mixing  box  dampers  (green,  figure  4.5)  are  commanded  to  bring  in  the 
minimum  amount  of  outdoor  air  needed  for  ventilation.  However,  the  recirculation  damper  is 
stuck  closed,  so  the  actual  outdoor  air  flow  rate  is  greater  than  the  minimum  required.  The 
qualitative  effect  of  the  stuck  recirculation  damper  can  be  seen  by  comparing  the  mixed  air 
temperature  (brown,  figure  4.6)  to  the  return  (red,  figure  4.6)  and  outdoor  air  temperatures 
(green, figure 4.6). If the dampers were aligned correctly, the mixed air temperature would be
very  close  to  the  return  air  temperature,  but  it  is  actually  much  closer  to  the  outdoor  air 
temperature,  due  to  the  excessive  amount  of  outdoor  air  being  drawn  into  the  AHU.  The  AHU 
controller  modulates  the  cooling  coil  valve  (blue,  figure  4.5)  to  maintain  the  supply  air 
temperature  (light  blue,  figure  4.6)  at  its  setpoint  (dark  blue,  figure  4.6).  The  heating  coil  valve 
(red,  figure  4.5)  is  closed. 

Figure 4.3 VCBT AHU-C Mixed Air Temperature Sensor Drift - Control Signals

Figure 4.4 VCBT AHU-C Mixed Air Temperature Sensor Drift - Process Temperatures

Based on this combination of control signals, APAR determines the system to be operating in
Mode 4 (mechanical cooling with minimum outdoor air) and evaluates the applicable rule set.
One of the rules in the set for Mode 4 is Rule 18, which first checks whether there is enough of a
difference,  5.6  °C  (10  °F),  between  the  return  and  outdoor  air  temperatures  in  order  to  proceed. 
In  the  case  illustrated  by  figures  4.5  and  4.6,  the  difference  becomes  sufficient  from 
approximately  250  minutes  after  the  beginning  of  the  occupied  period  until  the  end  of  the 
occupied  period.  If  the  difference  is  sufficient,  the  next  step  in  the  evaluation  of  Rule  18  is  to 
calculate  the  fraction  of  outdoor  air  by  dividing  the  difference  between  mixed  air  and  return  air 
temperature  by  the  difference  between  outdoor  air  and  return  air  temperature.  If  the  calculated 
outdoor  air  fraction  varies  by  more  than  a specified  threshold,  0.30,  from  the  minimum  amount 
of  outdoor  air  needed  for  ventilation  (in  this  case,  0.15),  Rule  18  is  satisfied,  indicating  that  this 
fault  has  been  detected.  The  fault  status  data  collected  from  the  AHU-A  controller  indicates  that 
this  fault  was  detected  on  this  particular  day  of  operation. 
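The two steps of Rule 18 can be sketched as follows. The thresholds and the minimum outdoor air fraction are those quoted in the text; the function name and the example temperatures are hypothetical.

```python
def rule_18(ma_temp_c, ra_temp_c, oa_temp_c,
            min_oa_fraction=0.15, fraction_threshold=0.30, min_temp_diff_c=5.6):
    """Fault if the calculated outdoor air fraction deviates too far from the minimum."""
    # Step 1: the rule is only evaluated when return and outdoor air differ enough.
    if abs(ra_temp_c - oa_temp_c) < min_temp_diff_c:
        return False
    # Step 2: estimate the outdoor air fraction from the three temperatures.
    oa_fraction = (ma_temp_c - ra_temp_c) / (oa_temp_c - ra_temp_c)
    return abs(oa_fraction - min_oa_fraction) > fraction_threshold


# Hypothetical values: return air 24 °C, outdoor air 31 °C.
print(rule_18(30.5, 24.0, 31.0))   # mixed air close to outdoor air: fraction near 0.93
print(rule_18(25.05, 24.0, 31.0))  # mixed air close to return air: fraction near 0.15
```

In the first case the calculated fraction is far above 0.15, as expected when the recirculation damper is stuck closed; in the second case the fraction matches the minimum and the rule is not satisfied.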

4.1.4  Economizer  Control  Logic  Fault 

When  this  fault  is  introduced,  the  logic  used  to  select  economizer  or  minimum  ventilation 
operation  is  reversed.  The  AHU-B  controller  compares  the  outdoor  air  temperature  to  a fixed 
changeover  temperature,  22.2  °C  (72  °F),  to  determine  whether  to  enable  or  disable  economizer 
operation.  Normally,  economizer  operation  is  disabled  if  the  outdoor  air  temperature  is  greater 
than  the  changeover  temperature  and  enabled  if  the  outdoor  air  temperature  is  less  than  the 
changeover  temperature,  subject  to  a deadband  of  1.1  °C  (2.0  °F).  When  the  economizer  fault  is 
implemented,  this  relationship  is  reversed:  economizer  operation  is  enabled  if  the  outdoor  air 
temperature  is  greater  than  the  changeover  temperature  and  disabled  if  the  outdoor  air 
temperature  is  less  than  the  changeover  temperature,  still  subject  to  the  specified  deadband. 
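The normal and reversed changeover logic can be sketched as a comparison with hysteresis. How the deadband is applied is an assumption here: the sketch disables the economizer above the changeover temperature plus the deadband, enables it below the changeover temperature minus the deadband, and holds the previous command in between. The function and parameter names are hypothetical.

```python
def economizer_command(oa_temp_c, previously_enabled,
                       changeover_c=22.2, deadband_c=1.1, reversed_logic=False):
    """Return True if economizer operation should be enabled."""
    if oa_temp_c > changeover_c + deadband_c:
        normal_command = False   # too warm outside: minimum ventilation
    elif oa_temp_c < changeover_c - deadband_c:
        normal_command = True    # cool enough outside: economize
    else:
        return previously_enabled  # inside the deadband: hold the last command
    # The fault inverts the decision outside the deadband.
    return (not normal_command) if reversed_logic else normal_command


# Hypothetical 30 °C cooling-season day.
print(economizer_command(30.0, previously_enabled=False))
print(economizer_command(30.0, previously_enabled=False, reversed_logic=True))
```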

Figures  4.7  and  4.8  show  data  from  AHU-B  from  the  occupied  portion  of  one  day  during  the 
emulation  of  this  fault.  On  this  particular  day,  the  outdoor  air  temperature  (green,  figure  4.8) 
ranges  from  26  °C  (79  °F)  to  33  °C  (91  °F),  which  is  greater  than  the  changeover  temperature 
(orange, figure 4.8) by more than the deadband, as is typically the case for cooling season. If
AHU-B were operating without any faults, these conditions would cause the AHU controller to
disable  the  economizer  and  align  the  mixing  box  dampers  to  bring  in  the  minimum  amount  of 
outdoor  air  needed  for  ventilation.  Due  to  the  control  logic  fault,  the  mixing  box  dampers 
(green,  figure  4.7)  are  actually  aligned  to  bring  in  100  % outdoor  air.  Qualitatively,  this  can  be 
seen  by  comparing  the  mixed  air  temperature  (brown,  figure  4.8)  to  the  return  (red,  figure  4.8) 
and  outdoor  air  temperatures  (green,  figure  4.8).  If  the  dampers  were  aligned  for  minimum 
outdoor  air,  the  mixed  air  temperature  would  be  very  close  to  the  return  air  temperature,  but 
actually  the  mixed  air  temperature  is  nearly  identical  to  the  outdoor  air  temperature,  because  the 
fault  in  the  AHU  control  logic  has  positioned  the  dampers  to  bring  in  100  % outdoor  air.  The 
AHU  controller  modulates  the  cooling  coil  valve  (blue,  figure  4.7)  to  maintain  the  supply  air 
temperature  (light  blue,  figure  4.8)  at  its  setpoint  (dark  blue,  figure  4.8).  The  heating  coil  valve 
(red,  figure  4.7)  is  closed. 

Figure 4.5 VCBT AHU-A Recirculation Damper Stuck Closed - Control Signals

Figure 4.6 VCBT AHU-A Recirculation Damper Stuck Closed - Process Temperatures

Based on this combination of control signals, APAR determines the system to be operating in
Mode 3 (mechanical cooling with 100 % outdoor air) and evaluates the applicable rule set. One
of the rules in the set for Mode 3 is Rule 9, which, as modified to match the absolute
temperature-based economizer control strategy employed for AHU-B, states that if the outdoor
air temperature (green, figure 4.8) is greater than the changeover temperature (orange, figure
4.8) plus the deadband (1.1 °C (2.0 °F)) by more than a specified threshold (1.7 °C (3 °F)), a
fault has been detected. The fault status data collected from the AHU-B controller indicates that
this fault was detected on this particular day of operation.
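The temperature-based form of Rule 9 reduces to a single comparison. This sketch uses the changeover temperature, deadband, and threshold quoted in the text; the function name is hypothetical, and the rule is assumed to be evaluated only when APAR has identified Mode 3.

```python
def rule_9_temperature_based(oa_temp_c, changeover_c=22.2,
                             deadband_c=1.1, threshold_c=1.7):
    """Fault if outdoor air is too warm for 100 % outdoor air operation."""
    return oa_temp_c > changeover_c + deadband_c + threshold_c


# Outdoor air range reported for the day shown in figures 4.7 and 4.8.
print([rule_9_temperature_based(t) for t in (26.0, 33.0)])
```

The limit works out to about 25.0 °C, so the rule is satisfied across the whole 26 °C to 33 °C range observed on that day.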

Figure 4.7 VCBT AHU-B Economizer Control Logic Fault - Control Signals


Figure  4.8  VCBT  AHU-B  Economizer  Control  Logic  Fault  - Process  Temperatures 

4.2 ERS


The  supply  air  temperature  from  the  AHU  controller  was  made  available  to  each  of  the  VAV  box 
controllers  via  the  control  network  and  was  used  as  the  entering  air  temperature  in  the  calculation 
of reheat coil differential temperature error. To establish the VPACC statistical parameters, three
days of normal operation data were collected at one-minute sampling intervals from four VAV
box controllers during the cooling season. The data were processed off-line and yielded the
parameters in Table 4.2. The mean calculated for each of the errors was used for x̄ in the
VPACC algorithm, and the standard deviation was used for σ̂. During testing of various fault
conditions, online inspection of the output from VPACC showed the output to be consistent with
what  was  expected.  That  is,  the  dominant  CUSUM  value  (or  values)  for  each  data  set  was 
appropriate  for  the  implemented  fault. 
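How the mean and standard deviation feed a CUSUM chart can be sketched with a generic two-sided tabular CUSUM. This is an illustration under stated assumptions, not the VPACC implementation: the slack value, the alarm limit, the reset after each alarm, and the example error sequences are all hypothetical, and the separate sum on the absolute airflow error is not modeled. The mean and standard deviation are the reheat coil differential temperature values from Table 4.2.

```python
def cusum_alarms(errors, mean, std_dev, slack=3.0, alarm_limit=100.0):
    """Return (sample index, label) pairs where a cumulative sum exceeds the alarm limit.

    'S' accumulates positive normalized errors, 'T' accumulates negative ones.
    slack and alarm_limit are hypothetical tuning values.
    """
    s_pos = 0.0
    t_neg = 0.0
    alarms = []
    for index, error in enumerate(errors):
        z = (error - mean) / std_dev            # normalized process error
        s_pos = max(0.0, s_pos + z - slack)     # grows when errors run high
        t_neg = max(0.0, t_neg - z - slack)     # grows when errors run low
        if s_pos > alarm_limit:
            alarms.append((index, "S"))
            s_pos = 0.0                         # assumed reset after an alarm
        if t_neg > alarm_limit:
            alarms.append((index, "T"))
            t_neg = 0.0
    return alarms


MEAN_C, STD_C = 1.11, 0.34   # differential temperature error parameters from Table 4.2

normal = cusum_alarms([1.11] * 60, MEAN_C, STD_C)   # errors at the normal mean
stuck = cusum_alarms([5.0] * 60, MEAN_C, STD_C)     # hypothetical persistent high error
print(len(normal), len(stuck))
```

Errors that stay near the normal-operation mean never accumulate, while a persistent positive error drives the positive sum past the limit repeatedly, which is the pattern reported for the reheat coil valve fault.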


Table 4.2 VPACC statistical parameters for the ERS data sets

Error       Mean                         Standard Deviation
Q_error     3.30×10⁻³ m³/s (7.00 CFM)    1.32×10⁻² m³/s (28.0 CFM)
T_error     0.59 °C (1.07 °F)            0.36 °C (0.65 °F)
ΔT_error    1.11 °C (2.00 °F)            0.34 °C (0.62 °F)

Results  obtained  for  this  data  set  are  shown  in  Table  4.3.  Twenty-seven  days  of  normal  operation 
data  were  processed  with  VPACC  with  no  false  alarms.  The  reheat  coil  valve  stuck  partially 
open  fault  was  implemented  for  four  days  with  different  severities.  Significant  differences 
between the entering and discharge air temperatures produced 17 alarms of S_ΔT. The failed
differential pressure sensor produced large negative airflow errors, leading to 9 alarms each of T_Q and
S_|Q|. Similarly, the stuck open damper fault produced large airflow errors, the signs of which
were  determined  by  the  loads  on  the  zone. 

The  unstable  airflow  fault  was  implemented  for  four  days  with  different  severities  and  produced 
6 alarms of S_|Q|. There was one day of testing that did not produce any alarms because the fault
was not severe enough. Figure 4.9 shows the airflow error for one day of testing when the fault
was more severe. On this particular day, the standard deviation of Q_error was 0.11 m³/s (233
CFM), which is more than eight times the standard deviation of the data for normal operation.
VPACC output is shown in Figure 4.10. The fault is apparent from S_|Q|, but not from S_Q or T_Q.
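The "more than eight times" comparison follows directly from the airflow error standard deviation for normal operation in Table 4.2. As a quick check, using subscripts "fault" and "normal" introduced here only for illustration:

```latex
\frac{\sigma_{\mathrm{fault}}}{\sigma_{\mathrm{normal}}}
  = \frac{0.11\ \mathrm{m^3/s}}{1.32 \times 10^{-2}\ \mathrm{m^3/s}}
  = \frac{233\ \mathrm{CFM}}{28.0\ \mathrm{CFM}}
  \approx 8.3
```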


Table 4.3 VPACC results for the ERS data sets

                                         Number    Number of Alarms
Operation                                of Days   S_Q   T_Q   S_T   T_T   S_ΔT   T_ΔT   S_|Q|
Normal                                   27        0     0     0     0     0      0      0
Reheat Coil Valve Stuck Partially Open   4         0     0     0     0     17     0      0
Failed DP Sensor                         3         0     9     0     0     0      0      9
Unstable Airflow                         4         0     0     0     0     0      0      6
Damper Stuck Open                        2         4     7     0     0     0      0      11

Figure 4.9 ERS data showing the effect of an unstable airflow fault.

Figure 4.10 VPACC output corresponding to the conditions in Figure 4.9 (horizontal axis: data point, 1-min intervals).

5 Summary and Future Work

The purpose of this report is to present the results of an investigation into embedding APAR, a
rule-based FDD tool for AHUs, and VPACC, a statistical quality control based FDD tool for
VAV boxes, into AHU and VAV box controllers, respectively. APAR consists of a set of expert
rules derived from mass and energy balances. Control signals are used to determine the AHU’s
mode of operation, which identifies the subset of the rules to be evaluated. VPACC uses a small
set  of  process  errors,  valid  for  most  VAV  box  control  strategies,  to  measure  VAV  box 
performance.  CUSUM  charts,  a statistical  quality  control  tool,  are  used  to  evaluate  the  process 
errors.  Thresholds  are  determined  by  statistical  analysis  of  a database  of  “normal  operation” 
data. 

APAR was evaluated in an emulation environment, while VPACC was evaluated in a laboratory
environment.  Consistent  results  detecting  a variety  of  common  mechanical  and  control  faults 
show  that  the  FDD  tools  are  both  effective  at  detecting  these  faults  and  are  suitable  for 
embedding  in  commercial  HVAC  equipment  controllers. 

In  a parallel  study,  the  FDD  tools  were  evaluated  using  trend  data  from  a number  of  field  sites. 
Follow-on  work  will  require  partnering  with  control  system  manufacturers  to  conduct  field  tests 
of APAR and VPACC, embedded in their own controller products. NIST’s vision of full
commercialization of automated fault detection and diagnostics is one in which APAR and
VPACC, along with appropriate parameters and thresholds, are packaged within HVAC control
products. In order for this vision to become reality, more work is needed in three main areas.
First, it is impractical to expect trend data to be evaluated to determine the necessary parameters
and thresholds for each site, as was done in this study. Ideally, sets of robust parameters and
thresholds that are effective across specified ranges of applications would be identified.
Additional field data from a wide variety of systems must be collected in order to determine
these robust parameters and thresholds. Second, the current embedded FDD tools are written using
generic mathematical functions available in the languages in which the controllers are
programmed. Although this approach is suitable for a technology demonstration, built-in APAR
and VPACC functions would greatly simplify the task of embedding FDD in a control program.

Finally,  more  work  is  needed  to  develop  alternative  ways  to  interpret  FDD  results  and  deliver 
this  information  to  the  building  operator.  The  most  direct  approach  is  to  generate  an  alarm  that 
the  operator  must  acknowledge  whenever  a rule  is  violated  (APAR)  or  a cumulative  sum  exceeds 
the alarm limit (VPACC). Refinements to the basic scheme are possible. For example, rather
than  automatically  sending  the  alarm  to  the  operator,  the  building  control  system  could  highlight, 
on  demand,  those  devices  having  experienced  the  greatest  number  of  alarms  in  a given  period  of 
time.  Or,  if  an  automated  maintenance  management  system  is  used,  an  alarm  could  automatically 
generate  an  appropriate  work  order.  However,  many  faults  are  the  result  of  design  or 
commissioning  issues  that  are  beyond  the  scope  of  the  building  maintenance  staff.  Furthermore, 
a fault  in  another  piece  of  equipment,  such  as  an  air  handling  unit,  boiler,  or  chiller,  could  result 
in  this  approach  generating  a large  number  of  alarms,  perhaps  overwhelming  the  operator.  A 
mechanism  is  needed  to  resolve  multiple  conflicting  fault  reports  before  reporting  them  to  the 
operator. 